XMPP Core [1] defines the fundamental streaming XML technology used by XMPP (i.e., stream establishment and termination including authentication and encryption). However, the core XMPP specification does not provide tools for actively managing a live XML stream.
The basic concept behind stream management is that the initiating entity (either a client or a server) and the receiving entity (a server) can exchange "commands" for active management of the stream. The following stream management features are of particular interest because they are expected to improve network reliability and the end-user experience:
Stream management implements these features using short XML elements at the root stream level. These elements are not "stanzas" in the XMPP sense (i.e., not <iq/>, <message/>, or <presence/> stanzas as defined in RFC 6120) and are not counted or acked in stream management, since they exist for the purpose of managing stanzas themselves.
Stream management is used at the level of an XML stream. To check TCP connectivity underneath a given stream, it is RECOMMENDED to use whitespace keepalives (see RFC 6120), XMPP Ping (XEP-0199) [2], or TCP keepalives. By contrast with stream management, Advanced Message Processing (XEP-0079) [3] and Message Delivery Receipts (XEP-0184) [4] define acks that are sent end-to-end over multiple streams; these facilities are useful in special scenarios but are unnecessary for checking of a direct stream between two XMPP entities.
Note: Stream Management can be used for server-to-server streams as well as for client-to-server streams. However, for convenience this specification discusses client-to-server streams only. The same principles apply to server-to-server streams.
The server returns a stream header to the client along with stream features, where the features include an <sm/> element qualified by the 'urn:xmpp:sm:3' namespace (see Namespace Versioning regarding the possibility of incrementing the version number).
Note: The client cannot negotiate stream management until it has authenticated with the server and has bound a resource; see below for specific restrictions.
To enable use of stream management, the client sends an <enable/> command to the server.
If the client wants to be allowed to resume the stream, it includes a boolean 'resume' attribute, which defaults to false [5]. For information about resuming a previous session, see the Resumption section of this document.
The <enable/> element MAY include a 'max' attribute to specify the client's preferred maximum resumption time in seconds.
Upon receiving the enable request, the server MUST reply with an <enabled/> element or a <failed/> element qualified by the 'urn:xmpp:sm:3' namespace. The <failed/> element indicates that there was a problem establishing the stream management "session". The <enabled/> element indicates successful establishment of the stream management session.
The parties can then the use stream management features defined below.
If the server allows session resumption, it MUST include a 'resume' attribute set to a value of "true" or "1" [5].
The <enabled/> element MAY include a 'max' attribute to specify the server's preferred maximum resumption time.
The <enabled/> element MAY include a 'location' attribute to specify the server's preferred IP address or hostname (optionally with a port) for reconnection, in the form specified in Section 4.9.3.19 of RFC 6120 (i.e., "domainpart:port", where IPv6 addresses are enclosed in square brackets "[...]" as described in RFC 5952 [6]); if reconnection to that location fails, the standard XMPP connection algorithm specified in RFC 6120 applies.
The client MUST NOT attempt to negotiate stream management until it is authenticated; i.e., it MUST NOT send an <enable/> element until after authentication (such as SASL, Non-SASL Authentication (XEP-0078) [7] or Server Dialback (XEP-0220) [8]) has been completed successfully.
For client-to-server connections, the client MUST NOT attempt to enable stream management until after it has completed Resource Binding unless it is resuming a previous session (see Resumption).
The server SHALL enforce this order and return a <failed/> element in response if the order is violated (see Error Handling).
Note that a client SHALL only make at most one attempt to enable stream management. If a server receives a second <enable/> element it SHOULD respond with a stream error, thus terminating the client connection.
After enabling stream management, the client or server can send ack elements at any time over the stream. An ack element is one of the following:
The following attribute is defined:
An <a/> element MUST possess an 'h' attribute.
The <r/> element has no defined attributes.
Definition: Acknowledging a previously-received ack element indicates that the stanza(s) sent since then have been "handled" by the server. By "handled" we mean that the server has accepted responsibility for a stanza or stanzas (e.g., to process the stanza(s) directly, deliver the stanza(s) to a local entity such as another connected client on the same server, or route the stanza(s) to a remote entity at a different server); until a stanza has been affirmed as handled by the server, that stanza is the responsibility of the sender (e.g., to resend it or generate an error if it is never affirmed as handled by the server).
Receipt of an <r/> element does not imply that new stanzas have been transmitted by the peer; receipt of an <a/> element only indicates that new stanzas have been processed if the 'h' attribute has been incremented.
The value of 'h' starts at zero at the point stream management is enabled or requested to be enabled (see note below). The value of 'h' is then incremented to one for the first stanza handled and incremented by one again with each subsequent stanza handled. In the unlikely case that the number of stanzas handled during a stream management session exceeds the number of digits that can be represented by the unsignedInt datatype as specified in XML Schema Part 2 [9] (i.e., 232), the value of 'h' SHALL be reset from 232-1 back to zero (rather than being incremented to 232).
Note: Each entity maintains two counters for any given stream: a counter of stanzas it has sent, and a counter of stanzas it has received and handled ('h'). The counter for an entity's own sent stanzas is set to zero and started after sending either <enable/> or <enabled/>. The counter for the received stanzas ('h') is set to zero and started after receiving either <enable/> or <enabled/>.
The following annotated example shows a message sent by the client, a request for acknowledgement, and an ack of the stanza.
When an <r/> element ("request") is received, the recipient MUST acknowledge it by sending an <a/> element to the sender containing a value of 'h' that is equal to the number of stanzas handled by the recipient of the <r/> element. The response SHOULD be sent as soon as possible after receiving the <r/> element, and MUST NOT be withheld for any condition other than a timeout. For example, a client with a slow connection might want to collect many stanzas over a period of time before acking, and a server might want to throttle incoming stanzas. The sender does not need to wait for an ack to continue sending stanzas.
Either party MAY send an <a/> element at any time (e.g., after it has received a certain number of stanzas, or after a certain period of time), even if it has not received an <r/> element from the other party. It is RECOMMENDED that initiating entities (usually clients) send an <a/> element right before they gracefully close the stream, in order to inform the peer about received stanzas. Otherwise it can happen that stanzas are re-sent (usually by the server) although they were actually received.
When a party receives an <a/> element, it SHOULD keep a record of the 'h' value returned as the sequence number of the last handled outbound stanza for the current stream (and discard the previous value).
If a stream ends and it is not resumed within the time specified in the original <enabled/> element, the sequence number and any associated state MAY be discarded by both parties. Before the session state is discarded, implementations SHOULD take alternative action regarding any unhandled stanzas (i.e., stanzas sent after the most recent 'h' value received):
Because unacknowledged stanzas might have been received by the other party, resending them might result in duplicates; there is no way to prevent such a result in this protocol, although use of the XMPP 'id' attribute on all stanzas can at least assist the intended recipients in weeding out duplicate stanzas.
It can happen that an XML stream is terminated unexpectedly (e.g., because of network outages). In this case, it is desirable to quickly resume the former stream rather than complete the tedious process of stream establishment, roster retrieval, and presence broadcast.
In addition, this protocol exchanges the sequence numbers of the last received stanzas on the previous connection, allowing entities to establish definitively which stanzas require retransmission and which do not, eliminating duplication through replay.
To request that the stream will be resumable, when enabling stream management the client MUST add a 'resume' attribute to the <enable/> element with a value of "true" or "1" [5].
If the server will allow the stream to be resumed, it MUST include a 'resume' attribute set to "true" or "1" on the <enabled/> element and MUST include an 'id' attribute that specifies an identifier for the stream.
Definition: The 'id' attribute defines a unique identifier for purposes of stream management (an "SM-ID"). The SM-ID MUST be generated by the server. The client MUST consider the SM-ID to be opaque and therefore MUST NOT assign any semantic meaning to the SM-ID. The server MAY encode any information it deems useful into the SM-ID, such as the full JID <localpart@domain.tld/resource> of a connected client (e.g., the full JID plus a nonce value). Any characters allowed in an XML attribute are allowed. The SM-ID MUST NOT be reused for simultaneous or subsequent sessions (but the server need not ensure that SM-IDs are unique for all time, only for as long as the server is continuously running). The SM-ID SHOULD NOT be longer than 4000 bytes.
As noted, the <enabled/> element MAY include a 'location' attribute that specifies the server's preferred location for reconnecting (e.g., a particular connection manager that hold session state for the connected client).
If the stream is terminated unexpectedly, the client would then open a TCP connection to the server. The order of events is as follows:
Note: The order of events might differ from those shown above, depending on when the server offers the SM feature, whether the client chooses STARTTLS, etc. Furthermore, in practice server-to-server streams often do not complete SASL negotiation or even TLS negotiation. The foregoing text does not modify any rules about the stream negotiation process specified in RFC 6120. However, since stream management applies to the exchange of stanzas (not any other XML elements), it makes sense for the server to offer the SM feature when it will be possible for the other party to start sending stanzas, not before. See also Recommended Order of Stream Feature Negotiation (XEP-0170) [13].
To request resumption of the former stream, the client sends a <resume/> element qualified by the 'urn:xmpp:sm:3' namespace. The <resume/> element MUST include a 'previd' attribute whose value is the SM-ID of the former stream and MUST include an 'h' attribute that identifies the sequence number of the last handled stanza sent over the former stream from the server to the client (in the unlikely case that the client never received any stanzas, it would set 'h' to zero).
If the server can resume the former stream, it MUST return a <resumed/> element, which MUST include a 'previd' attribute set to the SM-ID of the former stream and MUST also include an 'h' attribute set to the sequence number of the last handled stanza sent over the former stream from the client to the server (in the unlikely case that the server never received any stanzas, it would set 'h' to zero).
If the server does not support session resumption, it MUST return a <failed/> element, which SHOULD include an error condition of <feature-not-implemented/>. If the server does not recognize the 'previd' as an earlier session (e.g., because the former session has timed out), it MUST return a <failed/> element, which SHOULD include an error condition of <item-not-found/>. If the server recogizes the 'previd' as an earlier session that has timed out the server MAY also include a 'h' attribute indicating the number of stanzas received before the timeout. (Note: For this to work the server has to store the SM-ID/sequence number tuple past the time out of the actual session.)
In both of these failure cases, the server SHOULD allow the client to bind a resource at this point rather than forcing the client to restart the stream negotiation process and re-authenticate.
If the former stream is resumed and the server still has the stream for the previously-identified session open at this time, the server SHOULD send a 'conflict' stream error and close that stream.
When a session is resumed, the parties proceed as follows:
If an error occurs with regard to an <enable/> or <resume/> element, the server MUST return a <failed/> element. This element SHOULD contain an error condition, which MUST be one of the stanza error conditions defined in RFC 6120.
An example follows.
Stream management errors SHOULD be considered recoverable; however, misuse of stream management MAY result in termination of the stream.
When a remote entity acknowledges that it has handled a number of stanzas that is higher than the amount of stanzas that it was sent (by sending an 'h' value that is too high), the local entity SHOULD generate an undefined-condition stream error that includes a <handled-count-too-high/> element, and close the stream:
A cleanly closed stream differs from an unfinished stream. If a client wishes to cleanly close its stream and end its session, it MUST send a </stream:stream> so that the server can send unavailable presence on the client's behalf.
If the stream is not cleanly closed then the server SHOULD consider the stream to be unfinished (even if the client closes its TCP connection to the server) and SHOULD maintain the session on behalf of the client for a limited amount of time. The client can send whatever presence it wishes before leaving the stream in an unfinished state.
The following scenarios illustrate several different uses of stream management. The examples are that of a client and a server, but stream management can also be used for server-to-server streams.
The Stream Management protocol can be used to improve reliability using acks without the ability to resume a session. A basic implementation would do the following:
This is enough of an implementation to minimally satisfy the peer, and allows basic tracking of each outbound stanza. If the stream connection is broken, the application has a queue of unacknowledged stanzas that it can choose to handle appropriately (e.g., warn a human user or silently send after reconnecting).
The following examples illustrate basic acking (here the client automatically acks each stanza it has received from the server, without first being prompted via an <r/> element).
First, after authentication and resource binding, the client enables stream management.
The server then enables stream management.
The client then retrieves its roster and immediately sends an <r/> element to request acknowledgement.
The server handles the client stanza (here returning the roster) and sends an <a/> element to acknowledge handling of the stanza.
The client then chooses to acknowledge receipt of the server's stanza (although here it is under no obligation to do so, since the server has not requested an ack), sends initial presence, and immediately sends an <r/> element to request acknowledgement, incrementing by one its internal representation of how many stanzas have been handled by the server.
The server immediately sends an <a/> element to acknowledge handling of the stanza and then broadcasts the user's presence (including to the client itself as shown below).
The client then acks the server's second stanza and sends an outbound message followed by an <r/> element.
The server immediately sends an <a/> element to acknowledge handling of the third client stanza and then routes the stanza to the remote contact (not shown here because the server does not send a stanza to the client).
And so on.
The basic acking scenario is wasteful because the client requested an ack for each stanza. A more efficient approach is to periodically request acks (e.g., every 5 stanzas). This is shown schematically in the following pseudo-XML.
In particular, on mobile networks, it is advisable to only request and/or send acknowledgements when an entity has other data to send, or in lieu of a whitespace keepalive or XMPP ping (XEP-0199).
As noted, a server MUST NOT allow a client to resume a stream management session until after the client has authenticated (for some value of "authentication"); this helps to prevent session hijacking.
This XEP requires no interaction with the Internet Assigned Numbers Authority (IANA) [15].
This specification defines the following XML namespace:
The XMPP Registrar [16] includes the foregoing namespace in its registry at <https://xmpp.org/registrar/namespaces.html>, as described in Section 4 of XMPP Registrar Function (XEP-0053) [17].
If the protocol defined in this specification undergoes a revision that is not fully backwards-compatible with an older version, the XMPP Registrar shall increment the protocol version number found at the end of the XML namespaces defined herein, as described in Section 4 of XEP-0053.
The XMPP Registrar includes 'urn:xmpp:sm:3' in its registry of stream features at <https://xmpp.org/registrar/stream-features.html>.
Thanks to Bruce Campbell, Jack Erwin, Philipp Hancke, Curtis King, Tobias Markmann, Alexey Melnikov, Pedro Melo, Robin Redeker, Mickaël Rémond, Florian Schmaus, and Tomasz Sterna for their feedback.
This document in other formats: XML PDF
This XMPP Extension Protocol is copyright © 1999 – 2020 by the XMPP Standards Foundation (XSF).
Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the "Specification"), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.
## NOTE WELL: This Specification is provided on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. ##
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.
This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which can be found at <https://xmpp.org/about/xsf/ipr-policy> or obtained by writing to XMPP Standards Foundation, P.O. Box 787, Parker, CO 80134 USA).
The HTML representation (you are looking at) is maintained by the XSF. It is based on the YAML CSS Framework, which is licensed under the terms of the CC-BY-SA 2.0 license.
The Extensible Messaging and Presence Protocol (XMPP) is defined in the XMPP Core (RFC 6120) and XMPP IM (RFC 6121) specifications contributed by the XMPP Standards Foundation to the Internet Standards Process, which is managed by the Internet Engineering Task Force in accordance with RFC 2026. Any protocol defined in this document has been developed outside the Internet Standards Process and is to be understood as an extension to XMPP rather than as an evolution, development, or modification of XMPP itself.
The primary venue for discussion of XMPP Extension Protocols is the <standards@xmpp.org> discussion list.
Discussion on other xmpp.org discussion lists might also be appropriate; see <http://xmpp.org/about/discuss.shtml> for a complete list.
Errata can be sent to <editor@xmpp.org>.
The following requirements keywords as used in this document are to be interpreted as described in RFC 2119: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED"; "MAY", "OPTIONAL".
1. RFC 6120: Extensible Messaging and Presence Protocol (XMPP): Core <http://tools.ietf.org/html/rfc6120>.
2. XEP-0199: XMPP Ping <https://xmpp.org/extensions/xep-0199.html>.
3. XEP-0079: Advanced Message Processing <https://xmpp.org/extensions/xep-0079.html>.
4. XEP-0184: Message Delivery Receipts <https://xmpp.org/extensions/xep-0184.html>.
5. In accordance with Section 3.2.2.1 of XML Schema Part 2: Datatypes, the allowable lexical representations for the xs:boolean datatype are the strings "0" and "false" for the concept 'false' and the strings "1" and "true" for the concept 'true'; implementations MUST support both styles of lexical representation.
6. RFC 5952: A Recommendation for IPv6 Address Text Representation <http://tools.ietf.org/html/rfc5952>.
7. XEP-0078: Non-SASL Authentication <https://xmpp.org/extensions/xep-0078.html>.
8. XEP-0220: Server Dialback <https://xmpp.org/extensions/xep-0220.html>.
9. XML Schema Part 2: Datatypes <http://www.w3.org/TR/xmlschema11-2/>.
10. XEP-0203: Delayed Delivery <https://xmpp.org/extensions/xep-0203.html>.
11. RFC 5077: Transport Layer Security (TLS) Session Resumption without Server-Side State <http://tools.ietf.org/html/rfc5077>.
12. On the use of TLS Session resumption and SASL EXTERNAL <http://tools.ietf.org/html/draft-cridland-sasl-tls-sessions>. Work in progress.
13. XEP-0170: Recommended Order of Stream Feature Negotiation <https://xmpp.org/extensions/xep-0170.html>.
14. XEP-0030: Service Discovery <https://xmpp.org/extensions/xep-0030.html>.
15. The Internet Assigned Numbers Authority (IANA) is the central coordinator for the assignment of unique parameter values for Internet protocols, such as port numbers and URI schemes. For further information, see <http://www.iana.org/>.
16. The XMPP Registrar maintains a list of reserved protocol namespaces as well as registries of parameters used in the context of XMPP extension protocols approved by the XMPP Standards Foundation. For further information, see <https://xmpp.org/registrar/>.
17. XEP-0053: XMPP Registrar Function <https://xmpp.org/extensions/xep-0053.html>.
Note: Older versions of this specification might be available at http://xmpp.org/extensions/attic/
Specify error conditions.
Mark 'h' element as xs:unsignedInt in the schema too, it is already specified to be a 32-bit wrapping unsigned integer in the text.
Improve the note about stream management counters in section 4.
Send 'a' element before stream closure.
Fix example syntax highlighting and formatting.
Simplification based on implementation experience: removed acking per number of stanzas exchanged because either entity can request an ack at any time; moved throttling feature to a separate specification; removed 'stanzas' attribute from <enable/> element; added 'location' attribute to <enabled/> element; clarified several implementation issues in the text; fixed several examples; versioned the XML namespace from urn:xmpp:sm:2 to urn:xmpp:sm:3.
Corrected value of 'h' so that zero means no stanzas have yet been handled; clarified distinction between a cleanly closed stream and an unfinished stream.
Per a vote of the XMPP Council, advanced specification from Experimental to Draft.
Editorial review.
Removed pings (use XEP-0199, whitespace pings, or TCP keepalives instead); removed section on throttling, since it is unworkable.
Removed recommendation to use namespace prefixes; modified namespace to incorporate namespace versioning.
Added support for session resumption; re-organized the document; changed name to stream management; changed provisional namespace.
Updates per devcon discussion.
Require c attribute on <r/> element. Describe minimal implementation. Switch to standard temporary namespace.
Initial published version.
END