-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathdraft-rosenberg-dispatch-ripp-inbound.xml
423 lines (338 loc) · 17.5 KB
/
draft-rosenberg-dispatch-ripp-inbound.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<rfc category="std" submissionType="IETF" ipr="trust200902" docName="draft-rosenberg-dispatch-ripp-inbound-00">
<front>
<title abbrev="RIPP Inbound">RIPP for Single User Endpoints</title>
<author fullname="Jonathan Rosenberg" initials="J.R." role="editor"
surname="Rosenberg">
<organization abbrev="Five9">Five9</organization>
<address>
<postal>
<street>4000 Executive Parkway #400</street>
<city>San Ramon</city>
<region>CA</region>
<code>94583</code>
<country>US</country>
</postal>
<email>jdrosen@jdrosen.net</email>
</address>
</author>
<date month="January" year="2020" day="27"/>
<area>Applications</area>
<abstract>
<t>The Real-Time Internet Peering Protocol (RIPP) defines a
technique for establishing, terminating and otherwise managing
calls between entities in differing administrative
domains. While it can be used for single user devices like an IP
phone, it requires the IP phone to have TLS certificates and be
publically reachable with a DNS record. This specification
remedies this by extending RIPP to enable clients to receive
inbound calls through their outbound connection to the
server. It also provides basic single-user features such as
forking, call push and pull, third-party call controls, and call
appearances. It describes techniques for resiliency of calls,
especially for mobile clients with spotty network connectivity. </t>
</abstract>
</front>
<middle>
<section title="Terminology">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only
when, they appear in all capitals, as shown here. </t>
</section>
<section title="Introduction">
<t>The Real-Time Internet Peering Protocol (RIPP) defines a
technique for establishing, terminating and otherwise managing calls
between entities in differing administrative domains. It is an
application ontop of HTTP/3, and as such has the notion of a client
that opens connections and makes requests to a server. In the core
RIPP specification, clients can only place outbound calls. Inbound
calls are supported by requiring an entity to also run a
server. </t>
<t>While this requirement is appropriate for use cases like SIP
trunking, carrier to carrier peering, or other arrangements
involving a large number of calls, it is a poor match for single
user devices. A single user device is one in which an actual end
user would log in and use that device for making and receiving
calls. Exampes include desktop softphones, browser based webRTC
appications, IP hardphones, and video conferencing endpoints. These
devices are often behind a NAT, dont have DNS names, and don't have
TLS certificates, all of which are pre-requisiites to run a
server. </t>
<t>Furthermore, an end user may often be logged into multiple such
devices, possibly from multiple locations. This introduces
additional requirements. Inbound calls need to be forked to all
devices, and ring on all of them. A user must be able to answer on
one, and stop ringing on the others. SIP <xref target="RFC3261"/>
natively supported these capabilities. However, it lacked other ones
which are clearly needed - native support for mobile-based apps
which utilize push notifications is one significant example. </t>
<t>SIP's lack of call state in servers as a built-in feature of the
protocol has also meant it couldn't readily support other features
truly needed for a system where a user can be logged into multiple
devices. These include the ability for one device to see the state
of the call, and know on which other device the call is being
handled. Another important feature includes the ability to - from
any device - end the call, move it to a different device, or on the
device the user is sitting on. It also includes basic third party
call controls - the ability to initiate or answer a call from one
client, but have the media delivered to another. </t>
<t>To rememdy these challenges this specification provides an
extension to RIPP to facilitate multi-user devices. </t>
</section>
<section title="Overview of Operation">
<t>The actual protocol enhancement to RIPP is small. The primary one
is enabling a server to inform the client of an inbound call, and to
add additional events and supported event directions. The second one
is additional behaviors required of clients around handlers.</t>
<t>To signal usage of this specification, the server includes a new
element, "inbound", in its TG description. The format of this
element is identical to what it would look like to receive calls on
a TG that would have been hosted by the single-user device, had it
been able to do so. For example, the following TG describes a single
user TG which can handle both outbound and inbound calls:</t>
<figure>
<artwork><![CDATA[
{
"outbound": {
"origins" : "+14085551002",
"destinations" : "*"
},
"inbound": {
"destinations" : "+14085551002",
"origins" : "*"
}
}
]]>
</artwork>
</figure>
<t>The client will follow RIPP procedures for handler
registration. This is analagous to the SIP REGISTER operation. For
server to server peering arrangements, the handler represents a
particular collection of capabilities on an SBC. When used by
single-user devices, it represents each individual
device. Consequently, if a user has four IP phones, there would be
four handlers created on the server. As specified in RIPP each
client needs to remember its handler URI persistently in order to
modify it or delete it later on. </t>
<t>Unlike SIP, wherein a client frequently refreshes its registrtion
because it is soft-state, a RIPP client doesn't do this. Handlers
are not soft-state, they are hard-state. A server can remove
handler at will of course, but it is the responsibility of the
client, through events or a query to its own handler URI, to learn
of this and create a new one it if needed. Under normal
conditions however, the server won't ever remove them. This is
because receipt of inbound calls not tied to the existence of a
handler, which is a substantially different model than used by
SIP. Think of handlers like an inventory of a user's devices, solely
for the purpose of an end user selecting amongst them when making or
answering a call. As such, a server might delete one after extended
inactivity (say, one month). But this is not necessary except to
improve the user experience by eliminating old devices when a device
list is too long. </t>
<t>Unlike SIP-outbound, receipt on inbound calls doesnt rely on a
client having a persistent connection to receive a call. RIPP does
require a persistent connection for timely delivery of call events,
but it does not have a functional dependency on it. For example,
consider a client which lost its connection to the server for a few
seconds, perhaps due to a fade-out in wireless connectivity. If an
incoming call arrives at precisely that moment, the client wont
receive its event. However, upon recovery of network connectivity,
the client re-establishes its connection to the /events resource on
the TG, performs the long-lived GET for events, receives an event
about the calls in progress, including a new one that it is not
aware of. This event includes the URI for the call. It queries the
server, learns about the call, sees its state is "proceeding" and
direction is "inbound". This tells the client to render the incoming
call dialog. It can then answer and proceed with the call as
normal. Indeed, the client can be out of network connectivity for as
long as the caller is willing to wait for the call to be
answered. This makes RIPP quite resilient to wireless network
hiccups, network handoffs, and other network challenges which have
become commonplace. </t>
<t>In a similar way, the highly stateful nature of the RIPP server
means that RIPP is also more resilient to mid-call drops in
connectivity. A client which loses its network connectivity will, of
course, fail to receive media, and its peer won't receive any
either. To explicitly inform users of this, this specification
introduces a new event, "reconnecting", which informs the client
that its peer - which can be RIPP or SIP-based - has lost
connectivity and reconnect is in progress. There is also a
"reconnected" event that is sent to indicate reconnection. A server
can determine when to send this based on any policy it so
desires. It is used by the client solely for UI purposes. It
encourages the end user to wait rather than hanging up. This allows
the remote user to recover their network connectivity, or perhaps
pull the call to another device. So, for example, a user might be on
their call on mobile, lose connectivity when entering their
home. They go to their wired PC, which will know about the call, and
be able to hit a "pull call here" button or similar, and proceed
with the call. </t>
<t>The support of inbound calls on the TG means that the client can
learn about incoming calls by performing a GET to the /events
resource on the TG URI, as described in the core RIPP
specification. This specification adds a new event, "call", which
includes, as a parameter, the URI for the call that is now available
to the client. The server will generate this event for both inbound
and outbound calls, allowing any client that is retrieving these
events to know about new calls.</t>
<t>In order to receive inbound calls continuously, the client
maintains its long-lived GET request to the /events resource. Of
course, the client can learn about a new call through other means,
such as through mobile push notifications. These need only deliver
the call URI to the client, who can then query it and proceed from
there.</t>
<t>Once the client learns of this new call, it performs a GET on the
call URI. This will provide the client with the call direction (in
this case, inbound), calling party, destination number
and most recent event (in this case, proceeding). This allows the
client to render UI. </t>
<t>With call state fetched, the client follows the procedures
defined in RIPP, and establishes the signaling and media byways as
normal. Once it starts alerting the user, it generates an "alerting"
event towards the server. If it wishes to accept the call, it
generates an "answered" event towards the server. The "answered"
event, when sent by the client, includes a handler URI, which
specifies the device on which the client would like the media to be
rendered. Normally this will be itself. </t>
<t>However, the client can specify a different handler. Any
authorized client can learn about the handlers for the TG by
querying the /handlers endpoint, which returns a list of them. The
client can then iterate through them and request their details. It can
also receive updates to the set of handlers using the /events
resource on the TG. Specifying a different handler enables
third-party call control operations. The image, device nickname and
other attributes on the handler allow an end user to usefully select
the device on which to receive the audio and/or video. </t>
<t>In the case where, for example, a user answers a call and
specifies a handler different from itself, the device associated
with that handler needs to know about this. It will learn about this
through its signaling byway to the call. It will see an "answered"
event for a call, and then an "attaching" event with the handler
URI. When the client sees that the handler URI is itself, this tells
it that it must render media. To do so, it queries the call URI,
which will return to it the directive to use. It also begins to
accept media from the server.
</t>
</section>
<section title="Example Use Cases">
<t>This section outlines example use cases that are enabled by this
specification. It is not normative in nature. It merely describes
how the new API features defined by this specification can be used
by clients to deal with these cases.</t>
<section title="Inbound Call Forking">
</section>
<section title="Answer and Stop Ringing Other Devices">
</section>
<section title="Remote in Use">
</section>
<section title="Call Pull">
</section>
<section title="Call Push">
</section>
<section title="Select Device">
</section>
<section title="3PCC: Place Outbound">
</section>
<section title="3PCC: Answer or Decline Inbound">
</section>
<section title="3PCC: Hangup">
</section>
<section title="3PCC: Move Call">
</section>
<section title="Resiliency: Miss Incoming call">
</section>
<section title="Resiliency: Mid-Call Network Change">
</section>
<section title="Resiliency: Mid-Call Wireless Fade and Recover">
</section>
<section title="Resiliency: Mid-Call Wireless Fade and Move">
</section>
<section title="Resiliency: Mid-Call Wireless Fade and Peer Hangup">
</section>
<section title="Resiliency: Mid-Call Wireless Fade and Server Drop">
</section>
</section>
<section title="Normative Protocol Specification">
<t>A server that supports inbound calls on its TG MUST include the
"inbound" element in its TG description. This MUST include the
allowed caller IDs in the "origins" element, and the allowed
destinations in the "destinations". </t>
<t>When a server receives a new inbound call, or places a new
outbound call, it MUST send the "call" event to any clients which
have a GET query in progress to its TG. This event MUST include the
call URI associated with the call that the server is aware of. The
server can also utilize push notifications or other techniques
outside the scope of this speification to alert clients to new
calls. </t>
<t>The server MUST allow the client to send a "proceeding",
"alerting", "answered", "declined", "failed", "noanswer" and "end"
events, and take the associated actions on the call.</t>
<t>A client that answers a call MUST include, as an attribute to the
"answered" event, the handler that is to be used for the call. This
handler MUST be active and registered to the TG against which the
call was received. The server MUST verify this, and if it is not,
generate a "handler-invalid" event, with the handler URI from the
request as a parameter. In this case, the server MUST NOT consider
the call answered. If the handler URI was valid, the server MUST
construct a directive per the procedures defined in the core RIPP
specification. It MUST then generate an "attaching" event to all
listeners, and MUST include the handler URI. It MUST also generate
the "answered" event.
</t>
<t>When a client that had sent an "answered" event receives a
"handler-invalid" event including the URI it had generated, it
SHOULD refresh its list of handlers, inform the user of the answer
failure, and request the user to try again. If the client receives an
"attaching" event, and the handler specified matches its own, it
SHOULD query the call URI to obtain the directive [[OPEN ISSUE: Not
sure I like this, vs. having the client do a POST and have the
server construct the directive then. Thinking.]] It
MUST initiate signaling and media byways for the call, render
incoming media and generate outgoing media for the call. Once the
client has sent and received at least one media packet, indicating a
successful connection, it MUST generate a "handler-connected" event,
including the URI of its handler. As will all call events, this is
sent to all listeners. This allows for a reliable handshake for
answering calls elsewhere.
</t>
<t>Similarly, a client MAY, when creating a call, specify a handler
in the POST to create the call which is different than itself. When
it does that, it follows the procedures defined in the previous
paragraph as well.</t>
<t>A server MAY send, at its own discretion, a "reconnecting" event
if it believes the remote side has lost connectivity. It is
RECOMMENDED that the server send this event once it has ceased to
receive inbound media packets for five seconds. Similarly, a server
MUST send a subsequent "reconnected" event once media begins to flow
once more. If a server receives a new GET request against the
/events resource on the call, it MUST include the "reconnecting"
event if it had sent one recently but not yet sent the "reconnected
event. This allows a client to obtain the full state of the call
when retrieving events.
</t>
</section>
<section title="Syntax">
<t>This specification outlines the syntax for the new events and TG
description. </t>
</section>
<section title="IANA Considerations">
<t>No values are assigned in this document, no registries are created,
and there is no action assigned to the IANA by this document. </t>
</section>
<section title="Security Considerations">
<t>This document introduces no new security considerations. It is a
process document about changes to the rules for certain corner
cases in publishing IETF stream RFCs.</t>
</section>
</middle>
<back>
<references title="Informative References">
<?rfc include="reference.RFC.2119"?>
<?rfc include="reference.RFC.8174"?>
<?rfc include="reference.RFC.3261"?>
</references>
</back>
</rfc>