HK1181579A - Distributing multi-source push notifications to multiple targets - Google Patents
Distributing multi-source push notifications to multiple targets
- Publication number
- HK1181579A (application HK13108803.5A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- event
- data
- target
- normalized
- events
Description
Technical Field
The present invention relates to push notifications, and more particularly to distributing multi-source push notifications to multiple targets.
Background
Background and Related Art
Computers and computing systems have affected almost every aspect of modern life. Computers are commonly involved in work, leisure, health care, transportation, entertainment, home administration, and the like.
Furthermore, computing system functionality may also be enhanced by the computing system's ability to interconnect to other computing systems via network connections. The network connection may include, but is not limited to, a connection via a wired or wireless ethernet, a cellular connection, or even a computer-to-computer connection through a serial, parallel, USB, or other connection. These connections allow the computing system to access services on other computing systems and quickly and efficiently receive application data from the other computing systems.
Developers can build applications for iOS, Android, Windows Phone, Windows, and other platforms that focus on delivering news of general interest, information and facts about world events, or updates for fans of football, rugby, hockey, or baseball leagues and teams, to keep those users up-to-date. For any of these applications (and a wide variety of other applications), it makes a great difference to be able to pop up a reminder or message (toast) when a fan's favorite team scores or when a certain type of news event breaks out in the world. The distributor typically builds and runs a server infrastructure to push these events into a notification channel provided by the operating system platform or device vendor, which is beyond the skill set of many mobile application ("app") developers who focus on optimizing the user experience. And if their application is very successful, simple server-based solutions will quickly reach a scalability ceiling, since it is very challenging to distribute events to tens or even hundreds of thousands or millions of devices in a timely manner.
In addition, a large number of contemporary mobile applications are written as simple experiences over existing internet assets. For example, a news application may display the latest headlines from the RSS feeds of a primary news provider immediately upon the user opening the application, without the user having to navigate to a website. Independent software developers and small independent software vendors are building large numbers of such applications and are selling them at very low price points. For applications that would also benefit greatly from push notifications, not only the distribution of events but also the acquisition of event data presents a hurdle, as acquisition would likewise require building and running a non-trivial server infrastructure.
The subject matter claimed herein is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is provided merely to illustrate one exemplary technology area in which some embodiments described herein may be practiced.
Disclosure of Invention
One embodiment illustrated herein includes a method of delivering an event to a consumer. The method includes accessing private data. The method further includes normalizing the private data to create a normalized event. A plurality of end consumers that will receive the event based on subscriptions is determined. Data from the normalized event is formatted into a plurality of different formats that are individually appropriate for each determined end consumer. Data from the normalized event is delivered to each of the plurality of end consumers in a format that is appropriate for each end consumer and that conforms to the protocol rules defined by the target infrastructure through which the consumer is reached.
The summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Additional features and advantages will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the teachings herein. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. Features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.
Drawings
In order to describe the manner in which the above-recited and other advantages and features of the present subject matter can be obtained, a more particular description of the subject matter briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments and are not therefore to be considered to be limiting of its scope, the embodiments will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
FIG. 1 shows an overview of a system for collecting event data, mapping event data to generic events, and distributing event data to individual target consumers.
FIG. 2 illustrates an event data acquisition and distribution system;
FIG. 3 illustrates an example of an event data acquisition system;
FIG. 4 illustrates an example of an event data distribution system;
FIG. 5 illustrates an event data acquisition and distribution system;
FIG. 6 illustrates an embodiment of a badge counter function; and
FIG. 7 illustrates a method of delivering an event to a consumer.
Detailed Description
Embodiments may combine an event acquisition system with a notification distribution system and a mapping model for mapping events to notifications. Embodiments can also filter notifications based on criteria provided by the subscriber. Additionally, embodiments may have depth capabilities, such as tracking delivery counts for individual targets in an efficient manner.
An example of this is shown in FIG. 1. FIG. 1 shows an example where information from a large number of different sources 116 is delivered to a large number of different targets 102. In some examples, information from a single source, or aggregated information from multiple sources 116, may be used to create a single event that is delivered to a large number of targets 102. Note that the indicator 102 may be used to refer to all targets collectively, or to individual targets in general. A specific individual target may be indicated by a further specifier.
Fig. 1 shows sources 116. Note that the indicator 116 may be used to refer to all sources collectively, or to individual sources in general. A specific individual source may be indicated by a further specifier. Sources 116 may include, for example, a wide variety of public and private network services, including: RSS, Atom, and OData feeds; email mailboxes, including but not limited to those supporting the IMAP and POP3 protocols; social network information sources 116, such as a Twitter timeline or Facebook wall; and subscriptions on external publish/subscribe infrastructures, such as the Windows Azure™ Service Bus or Amazon's Simple Queue Service.
The sources 116 may be used to obtain event data. As will be explained in more detail below, the sources 116 may be organized into acquisition topics, such as acquisition topic 140-1. The event data may be mapped to normalized events, shown generally at 104. The normalized events 104 may be mapped to notifications for particular targets 102 by one or more mapping modules 130. The notifications 132 represent notifications for the targets 102. It should be appreciated that a single event 104 may be mapped to a plurality of different notifications, wherein the different notifications are in different formats suitable for distribution to a plurality of different targets 102. For example, FIG. 1 shows targets 102. The targets 102 support a number of different message formats depending on the characteristics of each target. For example, some targets 102 may support notifications in a relay format, other targets 102 may support notifications in MPNS (Microsoft Push Notification Service) format for Windows Phone 7 devices, other targets 102 may support notifications in APN (Apple Push Notification) format for iOS devices, other targets 102 may support notifications in C2DM (Cloud-to-Device Messaging) format for Android devices, other targets 102 may support JSON (JavaScript Object Notation) for browsers on devices, other targets 102 may support notifications over HTTP (Hypertext Transfer Protocol), and so forth.
Thus, the mapping by the mapping module 130 may map a single event 104 created from information from one or more data sources 116 to multiple different notifications for different targets 102. Different notifications 132 may then be delivered to the various targets 102.
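The mapping described above can be sketched in code. The following is a minimal illustration (the function and field names are hypothetical, not part of the claimed embodiment) of how a mapping module might render one normalized event into per-target notification formats:

```python
# Sketch: one normalized event fanned out into per-target formats.
import json

def to_toast(event):
    # Simplified toast-style XML payload (illustrative, not the real MPNS schema)
    return ("<toast><visual><binding template='ToastText01'>"
            f"<text id='1'>{event['title']}</text></binding></visual></toast>")

def to_json_payload(event):
    # Generic JSON payload, e.g. for browsers or plain HTTP targets
    return json.dumps({"title": event["title"], "body": event["synopsis"]})

FORMATTERS = {"toast": to_toast, "json": to_json_payload}

def map_event(event, targets):
    """Produce one formatted notification per target."""
    return [(t["address"], FORMATTERS[t["format"]](event)) for t in targets]

event = {"title": "Team scores!", "synopsis": "2-1 in the final minute"}
targets = [{"address": "device-a", "format": "toast"},
           {"address": "device-b", "format": "json"}]
notifications = map_event(event, targets)
```

A single event thus yields as many formatted notifications as there are targets, each in the format the target's platform understands.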
In some embodiments, this may be accomplished using a fan-out (fan-out) topology as shown in FIG. 2. Fig. 2 shows a source 116. As will be discussed later herein, embodiments may utilize the acquisition partition 140. Each of the fetch partitions 140 may include multiple sources 116. There may be a large and varied number of sources 116. The source 116 provides information. Such information may include, for example, but is not limited to, emails, text messages, real-time stock quotes, real-time event scores, news updates, and so forth.
FIG. 2 shows that each partition includes a fetch engine, such as illustrative fetch engine 118. The fetch engine 118 collects information from the sources 116 and generates events based on the information. In the example shown in fig. 2, a plurality of events are shown as being generated by the acquisition engine using various sources. The description is made using event 104-1. In some embodiments, the event 104-1 may be normalized as explained below. The acquisition engine 118 may be a service on a network, such as the internet, that collects information from the sources 116 on the network.
FIG. 2 shows the event 104-1 being sent to the distribution topic 144. The distribution topic 144 fans out events to multiple distribution partitions. Distribution partition 120-1 is used as representative of all of the distribution partitions. The distribution partitions each serve a plurality of end users or devices represented by subscriptions. The number of subscriptions served by a distribution partition may differ from the number served by other distribution partitions. In some embodiments, the number of subscriptions serviced by a partition may depend on the capacity of the distribution partition. Alternatively or additionally, a distribution partition may be selected to serve a user based on logical or geographic proximity to the end user. This may allow reminders to be delivered to the end user in a more timely manner.
In the illustrated example, distribution partition 120-1 includes a distribution engine 122-1. The distribution engine 122-1 consults the database 124-1. Database 124-1 includes information about subscriptions with details about associated delivery targets 102. In particular, the database may include information such as information describing the platform of the target 102, the application used by the target 102, the network address of the target 102, user preferences of the end user using the target 102, and so forth. Using the information in database 124-1, distribution engine 122-1 constructs bundle 126-1, where bundle 126-1 includes event 104 (or at least information from event 104) and a routing list 128-1 that identifies a plurality of targets 102 from targets 102 to which information from event 104-1 is to be sent as a notification. The bundle 126-1 is then placed in the queue 130-1.
The distribution partition 120-1 may include multiple delivery engines. The delivery engines dequeue the respective bundles from queue 130-1 and deliver the notifications to targets 102. For example, delivery engine 108-1 may retrieve bundle 126-1 from queue 130-1 and send the event 104 information to the targets 102 identified in routing list 128-1. Thus, notifications 134 including the event 104-1 information can be sent from the various distribution partitions to the targets 102 in a variety of different formats that are applicable to different targets 102 and specific to individual targets 102. This allows individualized notifications 134 to be created from a common event 104-1 at the edge of the delivery system, individualized for each target 102, rather than shipping a large number of individualized notifications through the delivery system.
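The bundle-and-queue flow above can be illustrated with a short sketch (names such as build_bundle are hypothetical; the subscription database is reduced to a list of dictionaries):

```python
# Sketch: a distribution engine builds a bundle (event + routing list),
# enqueues it, and a delivery engine dequeues it and fans out notifications.
from collections import deque

def build_bundle(event, subscription_db):
    # Routing list: targets whose subscriptions match this event's topic
    routing_list = [s["target"] for s in subscription_db
                    if s["topic"] == event["topic"]]
    return {"event": event, "routing_list": routing_list}

queue = deque()
db = [{"topic": "sports", "target": "phone-1"},
      {"topic": "sports", "target": "phone-2"},
      {"topic": "news", "target": "phone-3"}]
queue.append(build_bundle({"topic": "sports", "title": "Goal!"}, db))

# A delivery engine dequeues the bundle and emits one notification per target.
bundle = queue.popleft()
delivered = [(t, bundle["event"]["title"]) for t in bundle["routing_list"]]
```

The event travels through the system once per partition; individualization happens only at the delivery edge.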
An alternative description of an information collection and event distribution system that may be used in some embodiments is shown below.
As a basis, one embodiment system uses a publish/subscribe infrastructure provided by the Windows Azure Service Bus, available from Microsoft Corporation of Redmond, Washington, but the infrastructure may also exist in a similar fashion in various other messaging systems. This infrastructure provides two capabilities that facilitate the implementation of the presented method: topics and queues.
A queue is a storage structure for messages that allows messages to be added (enqueued) in sequential order and removed (dequeued) in the same order in which they were added. Messages may be added and removed by any number of concurrent clients, allowing the load on the enqueue side to be leveled and the processing load to be balanced across the various recipients on the dequeue side. The queue also allows an entity to acquire a lock on a message as it is dequeued, allowing the consuming client to explicitly control when a message is actually deleted from the queue, or whether it can be restored back into the queue in the event of a failure to process the retrieved message.
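The lock-on-dequeue behavior can be sketched as follows. This is an in-memory illustration of the semantics only (it is not the Service Bus API, and the method names complete/abandon are chosen for clarity):

```python
# Sketch of a queue with lock-on-dequeue ("peek-lock") semantics.
import itertools

class PeekLockQueue:
    def __init__(self):
        self._messages = {}          # id -> message body
        self._locked = set()         # ids currently locked by a consumer
        self._order = []             # FIFO ordering of ids
        self._ids = itertools.count()

    def enqueue(self, msg):
        mid = next(self._ids)
        self._messages[mid] = msg
        self._order.append(mid)
        return mid

    def dequeue(self):
        # Return the oldest unlocked message, locking it for this consumer.
        for mid in self._order:
            if mid in self._messages and mid not in self._locked:
                self._locked.add(mid)
                return mid, self._messages[mid]
        return None

    def complete(self, mid):
        # Consumer succeeded: the message is actually deleted.
        self._messages.pop(mid, None)
        self._order.remove(mid)
        self._locked.discard(mid)

    def abandon(self, mid):
        # Consumer failed: the message is restored back into the queue.
        self._locked.discard(mid)
```

An abandoned message becomes visible again and is redelivered, which is exactly the recovery behavior the paragraph describes.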
A topic is a storage structure that has all the characteristics of a queue, but allows multiple, concurrently existing 'subscriptions', each of which allows an isolated, filtered view of a sequence of enqueue messages. Each subscription on the topic produces a copy of each enqueue message, assuming that the subscription's associated filter criteria positively matches the message. Thus, enqueuing to a message with a topic of 10 subscriptions (where each subscription has a simple 'pass through' condition that matches all messages) will result in a total of 10 messages, one for each subscription. Like a queue, a subscription may have multiple concurrent consumers, providing a balance of processing load across multiple recipients.
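The copy-per-subscription behavior can be made concrete with a small sketch (an in-memory illustration with hypothetical names, not a real messaging API):

```python
# Sketch: a topic delivers a filtered copy of each message to every subscription.
class Topic:
    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name, filter_fn=lambda m: True):
        # Default filter is a simple 'pass through' that matches all messages.
        self.subscriptions[name] = {"filter": filter_fn, "queue": []}

    def publish(self, message):
        delivered = 0
        for sub in self.subscriptions.values():
            if sub["filter"](message):           # filter evaluated per subscription
                sub["queue"].append(dict(message))  # each subscription gets its own copy
                delivered += 1
        return delivered

topic = Topic()
for i in range(10):                              # 10 pass-through subscriptions
    topic.subscribe(f"sub-{i}")
copies = topic.publish({"title": "hello"})       # one copy per subscription
```

Publishing one message to a topic with 10 pass-through subscriptions yields 10 copies, matching the example in the text.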
Another basic concept is an 'event', which is simply a message in terms of the underlying publish/subscribe infrastructure. In the context of one embodiment, events are subject to a set of simple constraints governing the use of message bodies and message attributes. The message body of an event typically flows as an opaque block of data, and any event data considered by one embodiment typically flows in message attributes, which are a set of key/value pairs that are part of the message representing the event.
Referring now to FIG. 3, the goal of an embodiment infrastructure is to obtain event data from a wide variety of different sources 116 on a large scale and forward these events to the publish/subscribe infrastructure for further processing. Processing may include some form of analysis, real-time searching, or redistribution of events to interested subscribers through pull or push notification mechanisms.
One embodiment infrastructure defines an acquisition engine 118, a model for acquisition adapter and event normalization, a partition store 138 to maintain metadata about acquisition sources 116, a common partition and scheduling model, and a model for how user-initiated changes to acquisition sources 116 state are streamed into the system at runtime without further database lookups.
In particular implementations, the acquisition engine may support specific acquisition adapters to acquire events from a wide variety of public and private network services, including RSS, Atom, and OData feeds; email mailboxes, including but not limited to those supporting the IMAP and POP3 protocols; social networking information sources 116 like Twitter timelines or Facebook walls; and subscriptions to external publish/subscribe infrastructures like the Windows Azure Service Bus or Amazon's Simple Queue Service.
Event normalization
The event data is normalized so that the event can be actually consumed by the subscriber on the publish/subscribe infrastructure to which the event is handed over. Normalization in this context means that events are mapped onto a common event model with consistent representations of information items that may be of interest to a large number of subscribers in the various contexts. The model selected here is a simple representation of events in the form of a flat list of key/value pairs that can be accompanied by a single opaque binary data block that the system does not interpret further. This event representation is easily representable on most publish/subscribe infrastructures and also maps very clearly to common internet protocols such as HTTP.
To illustrate event normalization, consider the mapping of RSS or Atom feed entries to events 104 (see FIGS. 1 and 2). RSS and Atom are two internet standards that are very widely used to distribute news and other current information in chronological order, and to help make that information available for processing in a structured manner in computer programs. RSS and Atom share a very similar structure and a set of data elements that are named differently but are semantically identical. The first normalization step is therefore to define common names, like title or synopsis, as keys for such semantically identical elements defined in both standards. Second, data that appears in only one standard but not the other is typically mapped with its corresponding "native" name. In addition, these kinds of feeds are often provided with "extensions", which are data items that are not defined in the core standard but add additional data using the extensibility facilities of the respective standard.
Some of these extensions (including but not limited to GeoRSS for geo-location or OData that embeds structured data into Atom feeds) are mapped in a common way shared across different event sources 116 so that subscribers on the publish/subscribe infrastructure to which events are transmitted can interpret geo-location information in a uniform way, regardless of whether data has been acquired from RSS or Atom or Twitter timelines. Continuing with the GeoRSS example, a simple GeoRSS expression representing a geographical "point" may thus be mapped to a pair of numeric "latitude"/"longitude" attributes representing the coordinates of WGS 84.
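Continuing the GeoRSS example, the mapping of a geographical "point" to a pair of numeric attributes can be sketched in a few lines (the function name is hypothetical; GeoRSS-Simple encodes a point as a latitude and longitude separated by whitespace, in WGS 84 coordinates):

```python
# Sketch: normalize a GeoRSS "point" payload into flat key/value pairs.
def normalize_georss_point(point_text):
    """Map a GeoRSS point ('lat lon') to WGS 84 latitude/longitude attributes."""
    lat, lon = point_text.strip().split()
    return {"latitude": float(lat), "longitude": float(lon)}

event_properties = normalize_georss_point("47.6205 -122.3493")
```

Subscribers can then read the same "latitude"/"longitude" keys regardless of whether the event came from RSS, Atom, or a Twitter timeline.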
Extensions with complex structured data (such as OData) can use a mapping model that preserves the complex types' structure and data without complicating the underlying event model. Some embodiments normalize to a regular, compact complex-data representation (like JSON) and map complex data attributes (e.g., an OData attribute "tenant" of a complex data type "person") to key/value pairs, where the key is the attribute name "tenant" and the value is the complex data describing the person in terms of name, biographical information, and address information, represented in JSON-serialized form. If the data source is an XML document, as is the case for RSS or Atom, the value can be created by transcribing the XML data into JSON, preserving the structure provided by the XML but flattening the XML node properties, meaning that both the XML attributes and the child elements of the same XML element node are mapped to JSON properties as "siblings" without further distinction.
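The XML-to-JSON flattening described above can be sketched as a small recursive function (an illustrative simplification: it ignores namespaces, repeated sibling tags, and mixed content):

```python
# Sketch: flatten XML attributes and child elements into sibling JSON properties.
import xml.etree.ElementTree as ET

def flatten_xml(element):
    """Map attributes and child elements of an XML node to one flat JSON object."""
    out = dict(element.attrib)                   # XML attributes become properties...
    for child in element:                        # ...and so do child elements,
        if len(child) or child.attrib:           #    as "siblings" of the attributes
            out[child.tag] = flatten_xml(child)  # nested structure is preserved
        else:
            out[child.tag] = child.text or ""
    return out

xml = '<person id="7"><name>Ada</name><address city="London"/></person>'
flat = flatten_xml(ET.fromstring(xml))
# flat == {"id": "7", "name": "Ada", "address": {"city": "London"}}
```

Note how the attribute `id` and the element `name` end up as siblings in the result, with no further distinction between the two XML property kinds.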
Source and partition
One embodiment infrastructure captures metadata about a data source 116 in a "source description" record, which may be stored in the source database 138. A "source description" may have a set of common elements and a set of elements that are specific to the data source. The common elements may include the name of the source, the time-span interval during which the source 116 is considered valid, a human-readable description, and a type discriminator for the source 116. The source-specific elements depend on the type of source 116 and may include a network address, credentials or other security-critical material needed to gain access to the resource represented by the address, and metadata that directs the source acquisition adapter to perform data acquisition in a particular manner, like providing a time interval for checking RSS feeds, or to pace event forwarding in a particular manner, such as spacing events acquired from a current-events news feed at least 60 seconds apart so that, if such an end-to-end experience is to be built, the recipient has the opportunity to view each breaking news item on a limited screen surface.
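A source description record of this shape could be modeled as follows (field names are hypothetical; the text only specifies the categories of elements, not their exact names):

```python
# Sketch: a "source description" record with common and source-specific elements.
from dataclasses import dataclass, field

@dataclass
class SourceDescription:
    # Common elements
    name: str
    source_type: str                  # type discriminator, e.g. "rss", "imap", "twitter"
    valid_from: str                   # start of the validity time span
    valid_until: str                  # end of the validity time span
    description: str                  # human-readable description
    # Source-specific elements (depend on source_type)
    address: str = ""                 # network address of the resource
    credentials: dict = field(default_factory=dict)      # security-critical material
    acquisition_hints: dict = field(default_factory=dict)  # e.g. {"poll_interval_s": 60}

feed = SourceDescription(
    name="world-news",
    source_type="rss",
    valid_from="2012-01-01",
    valid_until="2013-01-01",
    description="Breaking world news feed",
    address="http://example.com/feed.rss",
    acquisition_hints={"poll_interval_s": 60, "min_event_spacing_s": 60},
)
```

The acquisition_hints dictionary is where pacing directives like the 60-second spacing would live.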
The source descriptions are maintained in one or more stores, such as source database 138. The source descriptions may be partitioned across and within these stores along two different axes.
The first axis is the system tenant distinction. A system tenant, or "namespace", is a mechanism that creates isolation scopes for entities within the system. As one specific scenario, if "Fred" is a user of a system implementing one embodiment, Fred will be able to create a tenant scope that provides Fred with an isolated virtual environment in which source descriptions, configurations, and state can be maintained completely independently of other sources 116 in the system. This axis may be used as a distinguishing factor to spread source descriptions across stores, especially in situations where a tenant requires isolation of stored metadata (which may include security-sensitive data such as passwords), or for technical, administrative, or business reasons. A system tenant may also represent an affinity to a particular data center in which the source descriptions are maintained and where the data acquisition is performed.
The second axis may be a distinction by numeric partition identifiers selected from a predefined range of identifiers. A partition identifier may be derived from invariants contained in the source description, such as the source name and the tenant identifier. The partition identifier may be derived from these invariants using a hash function (one of many candidates is the Jenkins hash; see http://www.burtleburtle.net/bob/hash/doobs.html), and the resulting hash value may be reduced to the partition identifier range using a modulo function on the hash value. The range of identifiers is selected to be larger (and may be significantly larger) than the maximum number of storage partitions that are expected to be needed to store all of the source descriptions maintained in the system.
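The hash-and-modulo derivation can be sketched as follows. SHA-256 stands in for the hash function here (the text names the Jenkins hash as one candidate; any stable hash works), and the range size is an arbitrary illustrative choice:

```python
# Sketch: derive a stable partition identifier from source-description invariants.
import hashlib

NUM_PARTITION_IDS = 16384   # chosen much larger than the expected storage partitions

def partition_id(tenant, source_name):
    """Hash the invariants, then reduce to the identifier range with modulo."""
    key = f"{tenant}/{source_name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITION_IDS

pid = partition_id("fred", "http://example.com/feed.rss")
```

Because the inputs are invariants of the source description, the same source always maps to the same partition identifier, which is what makes ownership assignment deterministic.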
The introduction of storage partitioning is typically motivated by a capacity limit, either one immediately related to storage capacity quotas on the underlying data stores or one affecting the acquisition engine 118 (such as a bandwidth constraint for a given data center or portion of a data center), which may lead embodiments to create acquisition partitions 140 that utilize capacity across different data centers or data center sections to meet inbound bandwidth requirements. A storage partition owns a subset of the entire range of identifiers, and the association of a source description with a storage partition (and the resources needed to access that partition) can therefore be inferred directly from its partition identifier.
In addition to providing the storage partitioning axis, the partition identifier is also used to schedule acquisition tasks and to clearly define the ownership of a given source description by an acquisition partition 140 (which may be different from its relationship to a storage partition).
Ownership and acquisition partition
Each source description in the system may be owned by a particular acquisition partition 140. Clear and unique ownership is used because the system must not acquire events from exactly the same source 116 in multiple locations in parallel, which could result in duplicate events being transmitted. To make this more concrete, an RSS feed defined within a tenant is owned by exactly one acquisition partition 140 in the system, and within that partition there is one scheduled acquisition running on a particular feed at any given point in time.
The acquiring partition 140 obtains ownership of the source description by obtaining ownership of the partition identifier range. The acquisition partition 140 may be assigned a range of identifiers using an external dedicated partition system that may have failover functionality and may assign a primary/backup owner, or using a simpler mechanism in which the range of partition identifiers is evenly spread among the number of different compute instances that assume the role of an acquisition engine. In a more advanced implementation with an external partition system, if the system starts from a "cold" state, meaning that the partition does not yet have a previous owner, the selected master owner of the partition is responsible for seeding the scheduling of tasks. In a simpler scenario, a compute instance that owns a partition owns a seed to the schedule.
Scheduling
The scheduling requirements of the acquisition tasks depend on the nature of the particular source, but there are generally two types of acquisition models that are implemented in some of the described embodiments.
In the first model, the owner initiates some form of connection or long-running network request on the source's network service and waits for data in the form of datagrams or streams to be returned on that connection. In the case of a long-running request, also commonly referred to as long polling, the source network service will hold the request until a timeout occurs or until data becomes available; in turn, the acquisition adapter will wait for the pending request to complete with results or with no payload, and then resend the request. The acquisition scheduling model thus has the form of a "tight" loop that is initialized when the owner of the source 116 learns of the source, and in which a new request or connection is immediately initiated when the current connection or request completes or is temporarily interrupted. Since the owner directly controls the tight loop, the loop can reliably remain alive while the owner is running. If the owner stops and restarts, the loop also restarts. If the owner changes, the loop stops and the new owner starts the loop.
In the second model, the source's web service does not support long-running requests or connections that produce data as it becomes available, but is a conventional request/response service that returns immediately whenever it is queried. With such services (and this applies to many web resources), requesting data in a continuous tight loop places an enormous load on the source 116 and also produces significant network traffic that either merely indicates that the source 116 has not changed or, in the worst case, carries the same data over and over. To balance the need for timely event acquisition against not overloading the source 116 with unproductive query traffic, the acquisition engine 118 will thus execute requests in a "timed" loop, where requests to the source 116 are executed periodically at intervals that balance these considerations and also take into account hints from the source 116. A "timed" loop is initiated when the owner of the source 116 learns of the source.
There are two notable implementation variations of the timing loop. The first variant is for low-scale, best-effort scenarios and uses local, in-memory, timing objects for scheduling, which makes the scale, control, and restart characteristics similar to those of tight loops. The loop is initialized and a timer callback is immediately scheduled that causes the first iteration of the get task to run. When the task completes (even with an error) and it is determined that the loop will continue to execute, another timer callback is scheduled at the instant the task is to be executed next.
A second variant uses "scheduled messages", a feature of several publish/subscribe systems including the Windows Azure™ Service Bus. This variant provides a significantly higher acquisition rate at the expense of slightly higher complexity. The scheduling loop is initialized by the owner, and a message is placed into the scheduling queue of the acquisition partition. The message includes the source description. It is then picked up by a worker executing the acquisition task, which enqueues the resulting events into the target publish/subscribe system. Finally, the worker also enqueues a new "scheduled" message into the scheduling queue. The message is referred to as "scheduled" because it is tagged with the time instant at which it becomes available for retrieval by any consumer on the scheduling queue.
In this model, the acquisition partition 140 may be extended by an "owner" role that has one primary seeding schedule and may be paired with any number of "worker" roles that perform the actual acquisition task.
Source update
While the system is running, the acquisition partitions 140 need to be able to learn of new sources 116 to observe and of sources 116 that are no longer to be observed. The decision in this regard typically rests with the user and results from interaction with the management service 142, except for the case where a source 116 is blacklisted due to detected unrecoverable or temporary errors (described below). To communicate such changes, the acquisition system maintains a "source update" topic in the underlying publish/subscribe infrastructure. Each acquisition partition 140 has a dedicated subscription to that topic with a filter condition that constrains the eligible messages to those carrying partition identifiers within the range owned by that acquisition partition. This enables the management service 142 to submit updates about new or retired sources 116 and have them routed to the correct partition 140 without knowledge of the partition ownership distribution.
The management service 142 submits an update command to a topic that includes a source description, a partition identifier (for filtering purposes as previously described), and an operation identifier that indicates whether the source 116 is to be added or whether the source 116 is to be removed from the system.
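The update command and the per-partition filter condition can be sketched as follows (the message layout and names are hypothetical; in a real pub/sub system the filter would be a server-side subscription rule rather than a Python callable):

```python
# Sketch: "source update" command message plus a partition-range filter.
def make_update_command(source_description, pid, operation):
    """Command the management service publishes to the 'source update' topic."""
    assert operation in ("add", "remove")
    return {"body": source_description,
            "properties": {"partitionId": pid, "operation": operation}}

def partition_range_filter(low, high):
    """Filter condition installed on one acquisition partition's subscription."""
    return lambda msg: low <= msg["properties"]["partitionId"] <= high

cmd = make_update_command({"name": "news-feed"}, pid=1234, operation="add")
owned = partition_range_filter(1000, 1999)     # this partition owns ids 1000-1999
```

Only the subscription whose identifier range contains 1234 matches the command, so the update reaches exactly the owning partition.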
Once the command message has been retrieved by the owner of the acquisition partition 140, the owner will either schedule a new acquisition cycle for the new source 116, or it will interrupt and suspend, or even retire, an existing acquisition cycle.
Blacklisting
Sources 116 for which data acquisition fails may be blacklisted temporarily or permanently. Temporary blacklisting is performed when the network resources of the source 116 are unavailable or return an error that is not directly related to the initiated acquisition request. The duration of the temporary blacklisting depends on the nature of the error. Temporary blacklisting is performed by interrupting the regular scheduling loop (tight or timed) and scheduling the next iteration of the loop (via a callback or a scheduled message) at the time instant when the remote party is expected to have resolved the error condition.
Permanent blacklisting is performed when the error is determined to be a direct result of the acquisition request itself, meaning that the request is causing an authentication or authorization error, or that the remote source 116 indicates some other request error. If a resource is permanently blacklisted, the source 116 is marked as blacklisted in the partition store and the acquisition cycle is immediately aborted. Restoring a permanently blacklisted source 116 requires removing the blacklist marker in the store, possibly together with configuration changes that alter the request behavior, and restarting the acquisition cycle via the source update topic.
Notification distribution
Embodiments may be configured to distribute a copy of information from a given input event to each of a large number of "targets 102" associated with a particular scope, and to do so in a minimal amount of time for each target 102. The target 102 may include an address of a device or application coupled to an identifier of an adapter to some third party notification system or some network accessible external infrastructure, and assistance data for accessing the notification system or infrastructure.
Some embodiments may include an architecture that is divided into three different processing roles, which are described in detail below and can be understood with reference to FIG. 4. As shown by '1', ellipses, and 'n' in FIG. 4, each of the processing roles can have one or more instances. Note that the 'n' used for each role should be considered distinct from the others, meaning that the processing roles do not necessarily all have the same number of instances. The 'distribution engine' 122 role accepts events and binds them to a routing list (see, e.g., routing list 128-1 in FIG. 2) containing various sets of targets 102. The 'delivery engine' 108 role accepts these bindings and processes the routing list for delivery to the various network locations represented by the targets 102. The 'management role', illustrated by management service 142, provides an external API for managing targets 102 and is also responsible for accepting statistics and error data from the delivery engine 108 and for processing/storing that data.
The data stream is anchored on a 'distribution topic 144' to which events are submitted for distribution. Submitted events are tagged with their associated scope using message attributes, which may be one of the above constraints that distinguish the events from the original message.
In the illustrated example, the distribution topic 144 has one pass-through (unfiltered) subscription for each 'distribution partition 120'. A 'distribution partition' is an isolated set of resources responsible for distributing and delivering notifications to a given range of subsets of targets 102. The copy of each event sent into the distribution topic is available to all concurrently configured distribution partitions virtually simultaneously through their associated subscriptions, allowing parallelization of the distribution work.
The parallelization achieved by partitioning helps achieve timely distribution. To understand this, consider a scope with ten million targets 102. If the targets' data were held in non-partitioned storage, the system would have to sequentially traverse a single large database result set; even if the result set were obtained using partitioned queries against the same storage, the throughput for obtaining the target data would still be throttled by the throughput ceiling of that storage's front-end network gateway infrastructure. As a result, the delivery latency of notifications to targets 102 whose description records occur very late in the result set would likely be unsatisfactory.
In contrast, if the ten million targets 102 are scattered across 1,000 stores, each holding 10,000 target records, and these stores are paired with dedicated computing infrastructure (described herein as the 'distribution engine 122' and 'delivery engine 108') that performs queries and processes results per partition as described herein, then acquisition of the target descriptions can be parallelized across a large set of computing and network resources, significantly reducing the time difference, measured from the first event distributed to the last, when all events are distributed.
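The latency argument can be made concrete with an idealized back-of-the-envelope model. The per-store read rate below is an assumed figure, and the model deliberately ignores coordination overhead; it only shows how fan-out time shrinks when target records are spread over independent stores:

```python
def fanout_time(total_targets: int, stores: int, per_store_throughput: float) -> float:
    """Idealized seconds to read all target records when the work is spread
    evenly across independent stores (ignores coordination overhead)."""
    per_store = -(-total_targets // stores)  # ceiling division
    return per_store / per_store_throughput

# one store vs. 1,000 stores at the same assumed per-store read rate
t_single = fanout_time(10_000_000, 1, 5_000)        # 2000 seconds
t_partitioned = fanout_time(10_000_000, 1_000, 5_000)  # 2 seconds
```

Under these assumptions the last target in the single-store case waits over half an hour for its record to even be read, while in the partitioned case every store finishes in seconds.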
The actual number of distribution partitions is not technically limited. It can range from a single partition to any number of partitions greater than one.
In the illustrated example, once the 'distribution engine 122' of a distribution partition 120 obtains an event 104, it first calculates the size of the event data and from it the size budget for the routing list 128, which may be computed as the difference between the event size and the smaller of the maximum allowable message size and the absolute upper size limit of the underlying messaging system. The size of the event is limited such that some minimum headroom remains to accommodate the 'routing list' data.
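A minimal sketch of that size calculation follows. The specific byte limits are invented for illustration; the specification only says the budget is the delta between the event size and the smaller of the two ceilings, with some minimum headroom reserved:

```python
MAX_MESSAGE_SIZE = 256 * 1024  # assumed messaging-system ceiling, bytes
ABSOLUTE_LIMIT   = 192 * 1024  # assumed stricter absolute limit, bytes
MIN_HEADROOM     = 4 * 1024    # assumed minimum room reserved for the route list

def route_list_budget(event_size: int) -> int:
    """Bytes available for the routing list once the event is packed in."""
    ceiling = min(MAX_MESSAGE_SIZE, ABSOLUTE_LIMIT)
    if event_size > ceiling - MIN_HEADROOM:
        raise ValueError("event too large to leave route-list headroom")
    return ceiling - event_size
```

The check mirrors the constraint in the text: an event is rejected (or would be rejected upstream) if it leaves less than the minimum headroom for target descriptions.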
The routing list 128 is a list containing descriptions of targets 102. The distribution engine 122 creates the routing list by performing a lookup query against the targets 102 held in the partitioned store 124, returning all targets 102 that match the scope of the event, narrowed by a set of further conditions selected based on the event data. Embodiments may include a time-window condition that limits the results to targets 102 considered valid at the current time, meaning that the current UTC time falls within the start/end validity window contained in the target description, as well as other filtering conditions. This facility is also used for blacklisting, which is described later herein. While traversing the lookup results, the engine creates a copy of the event 104, populates the routing list 128 up to the maximum size with target descriptions retrieved from the store 124, and then enqueues the resulting bundle of event and routing list into the partition's 'delivery queue 130'.
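The lookup with its validity-window condition might look like the following sketch. The record schema (`scope`, `valid_from`, `valid_to`) and function name are illustrative assumptions, standing in for whatever the partitioned store actually exposes:

```python
import datetime

def eligible_targets(store, scope, now=None, extra_filters=()):
    """Yield targets in the given scope whose validity window covers 'now'.

    'store' is any iterable of target records (dicts with 'scope',
    'valid_from', 'valid_to' keys; illustrative schema)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    for t in store:
        if t["scope"] != scope:
            continue
        if not (t["valid_from"] <= now <= t["valid_to"]):
            continue  # expired, not yet valid, or temporarily blacklisted
        if all(f(t) for f in extra_filters):
            yield t
```

Note how the same window check naturally implements blacklisting: shifting a record's `valid_from` into the future (or its `valid_to` into the past) makes it invisible to this query without deleting it.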
The routing list technique allows the rate at which event/target pairs flow from the distribution engine 122 to the delivery engine 108 to be much higher than the actual message rate on the underlying infrastructure. For example, if 30 target descriptions can be packed into the routing list 128 along with the event data, event/target pairs flow 30 times as fast as they would if each event/target pair were packaged into its own message.
Delivery engine 108 is the consumer of event/routing-list bindings 126 from the delivery queue 130. The role of the delivery engine 108 is to dequeue these bindings and deliver the event 104 to all destinations listed in the routing list 128. Delivery typically occurs through an adapter that formats the event message into a notification message understood by the corresponding target infrastructure. For example, the notification message may be delivered to a Windows Phone 7 device in MPNS (Microsoft Push Notification Service) format, to an iOS device in APN (Apple Push Notification) format, to an Android device in C2DM (Cloud To Device Messaging) format, to a browser on a device in JSON (JavaScript Object Notation) format over HTTP (hypertext transfer protocol), etc.
The delivery engine 108 typically parallelizes delivery across independent targets 102 and serializes delivery across targets 102 sharing the scope implemented by the target infrastructure. An example of the latter case is that a particular adapter in the delivery engine may choose to send all events targeting a particular target application on a particular notification platform over a single network connection.
The distribution engine 122 and delivery engine 108 are decoupled using the delivery queue 130 to allow independent scaling of the delivery engine 108 and to prevent delivery slowdowns from backing up into, and blocking, the distribution query/packing stage.
Each distribution partition 120 may have any number of delivery engine instances concurrently observing the delivery queue 130. The length of the delivery queue 130 may be used to determine how many delivery engines are concurrently active. If the queue length exceeds a certain threshold, a new delivery engine instance may be added to partition 120 to increase the transmission throughput.
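A queue-length-driven scaling rule of the kind just described can be sketched in a few lines. The threshold and cap are purely illustrative; the specification only says that exceeding "a certain threshold" may trigger adding an instance:

```python
def desired_engine_count(queue_length: int, current: int,
                         scale_up_threshold: int = 10_000,
                         max_engines: int = 32) -> int:
    """Illustrative policy: add one delivery-engine instance to the
    partition whenever the backlog exceeds the threshold, up to a cap."""
    if queue_length > scale_up_threshold and current < max_engines:
        return current + 1
    return current
```

A real controller would also scale down on a sustained short queue and damp oscillation; this sketch shows only the scale-up decision named in the text.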
The distribution partitions 120 and associated distribution and delivery engine instances can scale in a virtually limitless manner, enabling optimal parallelization at large scale. If the target infrastructure is able to receive and forward one million event requests to devices in parallel, the described system is able to distribute events across its delivery infrastructure (possibly utilizing network infrastructure and bandwidth across data centers) in a manner that saturates the target infrastructure with event submissions, to the extent that the target infrastructure has capacity and any granted delivery quotas allow, for timely delivery to all required targets 102.
Upon delivering a message to a target 102 via its corresponding infrastructure adapter, in some embodiments the system logs a statistics entry. These entries include the measured duration between receipt of the delivery binding and delivery of the individual message, and the measured duration of the actual send operation. Another part of the statistics is an indicator of whether the delivery succeeded or failed. This information is collected within the delivery engine 108 and accumulated into averages on a per-scope and per-target-application basis. The 'target application' is a grouping identifier introduced for the specific purpose of statistics accumulation. The calculated averages are sent to the delivery status queue 146 at defined time intervals. That queue is drained by a worker (or group of workers) in the management service 142, which submits the data to a data warehouse for various purposes. Besides operational monitoring, these purposes may include billing tenants on whose behalf events are delivered and/or exposing the statistics to tenants for their own billing purposes.
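The accumulation step might be sketched as below. The class name, metric names, and flush contract are assumptions made for illustration; the specification only requires per-(scope, target application) averages flushed at intervals:

```python
from collections import defaultdict

class DeliveryStats:
    """Accumulates per-(scope, target application) delivery metrics and
    returns averages on flush (illustrative sketch)."""
    def __init__(self):
        # latency sum, send-time sum, sample count, success count
        self._sums = defaultdict(lambda: [0.0, 0.0, 0, 0])

    def record(self, scope, app, latency_s, send_s, success):
        s = self._sums[(scope, app)]
        s[0] += latency_s; s[1] += send_s; s[2] += 1; s[3] += int(success)

    def flush(self):
        """Return accumulated averages and reset, as if enqueuing them
        onto the delivery status queue."""
        out = {key: {"avg_latency_s": v[0] / v[2],
                     "avg_send_s": v[1] / v[2],
                     "success_rate": v[3] / v[2]}
               for key, v in self._sums.items()}
        self._sums.clear()
        return out
```

Sending only averages per interval, rather than one record per delivery, keeps the status queue's volume independent of the (potentially enormous) delivery volume.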
Upon detection of delivery errors, the errors are classified into temporary and permanent error conditions. Temporary error conditions may include, for example, a network failure that prevents the system from reaching the delivery point of the target infrastructure, or the target infrastructure reporting that a delivery quota has been temporarily reached. Permanent error conditions may include, for example, authentication/authorization errors on the target infrastructure or other errors that cannot be recovered from without manual intervention, as well as error conditions in which the target infrastructure reports that the target is no longer available or no longer wants to accept messages on a permanent basis. Once classified, the error report is submitted to a delivery failure queue 148. For temporary error conditions, the report may also include an absolute UTC timestamp indicating when the error condition is expected to be resolved. At the same time, the target is locally blacklisted by the target adapter for any further local deliveries by this delivery engine instance. The blacklist entry may also include that timestamp.
The delivery failure queue 148 is drained by a worker (or group of workers) in the management role. Permanent errors may cause the respective target to be immediately deleted from its distribution partition store 124, to which the management role has access. 'Deleted' means that the record is either actually removed or merely moved out of view of the lookup query by setting the 'end' timestamp of the record's validity period to a time in the past. A temporary error condition may cause the target to be deactivated for the time period indicated by the error. Deactivation may be accomplished by moving the beginning of the target's validity period forward to the timestamp indicated by the error, at which time the error condition is expected to have cleared.
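Both outcomes reduce to edits of the target record's validity window, which can be sketched as follows (the record schema and function name are illustrative assumptions, consistent with the window-based lookup described earlier):

```python
import datetime

def apply_failure(target: dict, permanent: bool,
                  retry_at, now: datetime.datetime) -> dict:
    """Reflect a delivery failure in the target record (illustrative).

    Permanent: push 'valid_to' into the past so lookup queries no longer
    see the record. Temporary: move 'valid_from' forward to the moment
    the error is expected to clear, deactivating the target until then."""
    if permanent:
        target["valid_to"] = now - datetime.timedelta(seconds=1)
    elif retry_at is not None:
        target["valid_from"] = retry_at
    return target
```

Handling both cases as timestamp edits means no separate blacklist table is needed in the store; the ordinary validity-window filter of the lookup query does the work.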
FIG. 5 shows a system overview illustration of the coupling of the acquisition partition 140 to the distribution partition 120 through the distribution topic 144.
As described above, in some embodiments, the generic events 104 may be created from information from the source 116. The generic events may be generated in a generic format such that later, data may be generated and placed in a platform-specific format. A number of examples of expressions that may map generic event attributes implemented in one embodiment to platform-specific notifications are now shown below.
An event attribute with a given name is referenced by $(name). Attribute names are not case sensitive. An attribute name may be a "dotted" expression (e.g., attribute.item) if the referenced attribute's value contains complex-typed data in the form of a JSON string. The expression resolves to the text value of the attribute, or to an empty string if the attribute is absent. The value may be clipped according to the size constraint of the target field.
$(name,n) is like the above, but the text is explicitly clipped at n characters; e.g., $(title,20) clips the content of the title attribute at 20 characters.
(name,n) is as above, but the clipped text is suffixed with three dots. The total size of the clipped string plus the suffix will not exceed n characters. (title,20) with an input attribute of "this is a title line" produces "this is title …".
%(name) is like $(name), except that the output is URI-encoded.
$body refers to the entity body of the event. Entity bodies are not clippable because they may contain arbitrary data, including binary data, and traverse the system as-is. If $body is mapped to a text attribute on the target, the mapping will, in some embodiments, succeed only if the body contains text content. If the entity body is empty, the expression resolves to an empty string.
$count refers to the per-target count of events delivered from a given source. The expression resolves to a number, calculated by the system, that indicates how many messages the corresponding target has received from the source 116 since the last request to reset the counter. In some exemplary embodiments, the number ranges from 0 to 99; having reached 99, the counter is no longer incremented. This value is typically used for badges and tile counters.
'text' or "text" is a literal. Literals contain any text enclosed in single or double quotation marks. The text may contain special characters in escaped form according to the escaping rules of ECMA-262, section 7.8.4. expr1 + expr2 is a concatenation operator that joins two expressions into a single string. Each expression may be any of the above.
expr1 expr2 is a conditional operator that evaluates to expr1 if expr1 is not 0 and not a zero-length string, and to expr2 otherwise. This operator has higher precedence than the + operator; i.e., the expression 'p' + $(a) $(b) will produce the value of a or b prefixed with the literal 'p'.
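The behavior of these expression forms can be sketched with plain functions rather than a parser. The function names (`attr`, `first_nonempty`) and the details of the "not 0 and not empty" test are illustrative assumptions; only the observable behavior is taken from the descriptions above:

```python
def attr(event: dict, name: str, n: int = None, ellipsis: bool = False) -> str:
    """$(name) / $(name,n): resolve an attribute (case-insensitively) to
    text, optionally clipped to n characters, optionally with a '…' suffix
    counted inside the n-character budget."""
    text = str(event.get(name.lower(), ""))
    if n is not None and len(text) > n:
        text = (text[: n - 1] + "\u2026") if ellipsis else text[:n]
    return text

def first_nonempty(*values):
    """The conditional operator: the first value that is neither 0 nor empty."""
    for v in values:
        if v not in ("", 0, "0"):
            return v
    return ""

event = {"title": "this is a title line", "a": "", "b": "boo"}
# 'p' + $(a) $(b): b wins because a is empty
tile_text = "p" + first_nonempty(attr(event, "a"), attr(event, "b"))
```

Missing attributes resolve to the empty string, so concatenations degrade gracefully rather than failing, matching the mapping language's described behavior.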
Embodiments may use a mapping language to retrieve attributes from events 104 and map them to the correct locations for notifications on target 102:
Tile notifications for Windows Phone may also utilize the $count attribute for automatically tracked counts.
For an iPad application, an embodiment may map the event to an alert as shown below:
Or just apply a badge (counter) to the application icon:
In some embodiments, the default for these mappings is that each target attribute is mapped to an input attribute having the same name. Thus, an embodiment may target a Windows Phone simply as follows:
And Text1, Text2, and Param will automatically be mapped from message attributes with the same name on the input event, and will remain empty (they will not be sent) if no such attributes exist. This allows full source-side attribute control when the source 116 is under the developer's control, as is typically the case with Windows Azure™ Service Bus queues and topic subscriptions.
For Google Android, the mapping is somewhat different because the C2DM service does not define a fixed format for notifications and has no immediate binding into the Android user-interface shell, so the mapping takes the form of a free-form property bag with the target property as the key and the expression as the value. If the PropertyMap is omitted, all input properties are mapped straight through to the C2DM endpoint.
Selective notification distribution
The embodiments described herein may implement functionality that allows a notification target 102 in a broadcast system to subscribe to any event stream that carries criteria enabling selective distribution of events from the stream to targets on a geographic, demographic, or other basis.
Specifically, the event data may have various classification data. For example, an event may be geo-tagged. Alternatively, events may be classified by the source, such as by a category string that includes the event.
Referring again to FIG. 1 and as described above with reference to the various figures, events 104 may include various types of classification data. For example, an event may include a geo-tag, wherein geo-coordinates are included in the alert. The distribution engine 122-1 may examine events for geo-tag data. The distribution engine 122-1 may also examine the database 124-1 to determine the targets 102 that are interested in data having the geo-tag. For example, a user may specify their location, or approximate location, and may specify that any alerts relevant to their location, or within 5 miles of it, should be delivered to them. The distribution engine 122-1 may determine whether the geo-tag in the data falls within that specification. If so, the distribution engine 122-1 may include that user in the routing list 128-1 for the event 104. Otherwise, the user is excluded from the routing list and will not receive a notification for the event 104.
For geo-tagged data, a user (or other entity controlling notification and event delivery to the user) may specify any of a number of different boundaries. For example, specifying any location within 5 miles of a given location essentially specifies a point and a circle around that point. Other embodiments may allow specifying geopolitical boundaries such as cities, states, countries, or continents; the outline of a building or complex; etc. SQL Server® from Microsoft Corporation of Redmond, Washington has geospatial functions that may be used as part of the distribution partition 120-1 to determine the targets 102 to which an event should be delivered.
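The point-and-circle case above reduces to a great-circle distance test, sketched here with the standard haversine formula. This is an illustrative stand-in for the geospatial functions the text attributes to the database layer:

```python
import math

def within_miles(lat1: float, lon1: float,
                 lat2: float, lon2: float, miles: float) -> bool:
    """Haversine test: is point 2 within 'miles' of point 1?"""
    r_miles = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r_miles * math.asin(math.sqrt(a)) <= miles
```

In practice a system at this scale would push the test into an indexed geospatial query rather than scan targets in application code; the function only illustrates the predicate being evaluated.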
Generally, event data may include classification information. For example, strings included in the event may classify the event data. Inclusion of a target in the routing list 128-1 may be based on the user opting into or out of a category. For example, the target 102-1 may opt into a category, and the category string may be compared against the event 104-1. If event 104-1 includes a string matching the opted-in category, then target 102-1 will be included in the routing list 128-1 of bundle 126-1, so that a notification with data from event 104-1 will be delivered to target 102-1.
Badge counter
Certain embodiments described herein allow individual counters to be tracked in an event broadcast system without requiring each end user's counter to be tracked separately. This may be done by a server receiving a series of events, where each event in the series is associated with a list of timestamps. The timestamp list for each event includes the timestamp of that event and the timestamps of all previous events in the series.
The user sends a timestamp to the server. The timestamp indicates when the user performed some interaction at the user device; for example, it may indicate when the user opened an application on the device. The server compares the timestamp sent by the user with the timestamp list of each event about to be sent to the user. The server counts how many timestamps in that list occur after the user's timestamp, and sends that count as the badge counter.
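The counting rule just described is small enough to sketch directly. The cap of 99 mirrors the $count range mentioned earlier; the function name and argument names are illustrative:

```python
def badge_count(event_timestamps, last_user_timestamp=None, cap=99):
    """Count events the user has not yet seen: timestamps in the event's
    list that fall after the user's last reported interaction."""
    if last_user_timestamp is None:
        unseen = len(event_timestamps)
    else:
        unseen = sum(1 for t in event_timestamps if t > last_user_timestamp)
    return min(unseen, cap)

# worked example matching the narrative: events at T1..T3,
# user last interacted between T2 and T3
no_interaction = badge_count([1, 2, 3])                       # all three unseen
after_t2 = badge_count([1, 2, 3], last_user_timestamp=2.5)    # only T3 unseen
```

Because the count is derived entirely from the timestamp list carried with each event, the server never stores a per-user counter, which is the point of the scheme.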
An example is shown in FIG. 6. FIG. 6 shows the targets 102. The target 102-1 receives events 104 and a badge counter 106 from the delivery engine 108-1. The target 102-1 sends a timestamp 110 to the delivery engine 108-1. The timestamp 110 sent by the target 102-1 to the delivery engine 108-1 may be based on some action at the target 102-1. For example, a user may open an application associated with the events 104 and badge counter 106 that the delivery engine 108-1 sends to the target 102-1. Opening the application may result in a timestamp 110 being sent from the target 102-1 to the delivery engine 108-1 indicating when the application was opened.
Delivery engine 108-1 receives a series 112 of events (shown as 104-1, 104-2, 104-3, and 104-n). Each event in the series 112 of events is associated with a list of timestamps 114-1, 114-2, 114-3, or 114-n, respectively. Each list of timestamps includes a timestamp of the current event and a timestamp of each event prior to the current event in the series. In the illustrated example, event 104-1 is a first event sent to delivery engine 108-1 for delivery to target 102. Thus, the list 114-1 associated with the event 104-1 includes a single entry T1 corresponding to the time that the event 104-1 was sent to the delivery engine 108-1. Event 104-2 is sent to delivery engine 108-1 after event 104-1, and thus event 104-2 is associated with list 114-2 including timestamps T1 and T2 corresponding to when events 104-1 and 104-2 were sent to delivery engine 108-1, respectively. The event 104-3 is sent to the delivery engine 108-1 after the event 104-2, and thus the event 104-3 is associated with a list 114-3 that includes timestamps T1, T2, and T3 corresponding to when the events 104-1, 104-2, and 104-3 were sent to the delivery engine 108-1, respectively. The event 104-n is sent to the delivery engine 108-1 after the event 104-3 (and possibly a number of other events as indicated by the ellipses in the list 114-n), and thus the event 104-n is associated with the list 114-n that includes timestamps T1, T2, T3 through Tn corresponding to when the events 104-1, 104-2, 104-3 through 104-n were sent to the delivery engine 108-1, respectively.
Assume first that the target 102-1 has not sent any timestamp 110 to the delivery engine 108-1. When the delivery engine sends event 104-1, it also sends a badge counter with a value of 1, corresponding to T1. When the delivery engine sends event 104-2, it also sends a badge counter with a value of 2, corresponding to the count of the two timestamps T1 and T2. When the delivery engine sends event 104-3, it also sends a badge counter with a value of 3, corresponding to the count of the three timestamps T1, T2, and T3. When the delivery engine sends event 104-n, it also sends a badge counter with a value of n, corresponding to the count of the n timestamps T1 through Tn.
Now assume that the target sends a timestamp 110 with an absolute time that falls between T2 and T3. At this point it is possible that events 104-1 and 104-2 have already been delivered to target 102-1. When event 104-3 is sent to the target, the delivery engine 108-1 counts only the timestamps occurring after timestamp 110 when determining the value of the badge counter. Thus, in this scenario the delivery engine 108-1 sends, along with event 104-3, a badge counter with a value of 1 corresponding to T3 (because events 104-1 and 104-2 were sent before timestamp 110). This process may be repeated, with the most recent timestamp 110 received from the target 102-1 being used to determine the badge counter value.
The following discussion now refers to various methods and method acts that may be performed. Although the various method acts are discussed in, or illustrated in, a particular order by flowcharts that occur in the particular order, the particular order is not required unless explicitly stated, or required because an act is dependent on another act being completed before the act is performed.
Referring now to FIG. 7, a method 700 is shown. The method includes an act of delivering an event to a consumer. Method 700 includes accessing private data (act 702). For example, each of the sources 116 may provide data in a proprietary format that is unique to a different source 116.
The method 700 further comprises: the private data is normalized to create a normalized event (act 704). For example, as indicated above, the events 104 may be normalized by normalized private data from different sources 116.
The method 700 further comprises: a number of end consumers that will receive the event based on the subscription is determined (act 706). For example, as shown in FIG. 2, the distribution engine 122-1 may query the database 124-1 to determine what users at the targets 102 have subscribed to.
The method 700 further comprises: the data from the normalized event is formatted into a plurality of different formats that are appropriate for all of the determined end consumers (act 708). For example, as shown in FIG. 1, the normalized event may be specifically formatted into a format suitable for the respective target.
The method 700 further comprises: the data from the normalized event is delivered to each of a plurality of end consumers in a format appropriate to the end consumer (act 710).
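The acts of method 700 can be sketched end to end as a tiny pipeline. The normalization scheme, the two platform formatters, and all names here are illustrative assumptions chosen only to show the normalize/determine/format/deliver flow, not the actual formats of the respective platforms:

```python
def normalize(source_name: str, raw: dict) -> dict:
    """Act 704: reduce proprietary source data to a flat key/value event."""
    return {"source": source_name,
            **{k.lower(): str(v) for k, v in raw.items()}}

# act 708: per-platform formatting (illustrative stand-ins, not real formats)
FORMATTERS = {
    "wp7": lambda e: {"wp:Text1": e.get("title", "")},
    "ios": lambda e: {"aps": {"alert": e.get("title", "")}},
}

def deliver(event: dict, subscribers: list) -> list:
    """Acts 706/710: fan the normalized event out, formatted per
    subscriber platform, to each subscriber address."""
    return [(s["address"], FORMATTERS[s["platform"]](event))
            for s in subscribers]

evt = normalize("feed", {"Title": "goal!"})
out = deliver(evt, [{"address": "dev1", "platform": "wp7"},
                    {"address": "dev2", "platform": "ios"}])
```

The key property, matching the method, is that normalization happens once per event while formatting happens once per end consumer, after the subscriber set has been determined.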
Further, the various methods may be implemented by a computer system including one or more processors and a computer-readable medium, such as computer memory. In particular, the computer memory may store computer-executable instructions that, when executed by the one or more processors, cause various functions to be performed, such as the various acts described in the embodiments.
Embodiments of the present invention may comprise or utilize a special purpose or general-purpose computer including computer hardware, as discussed in greater detail below. Embodiments within the scope of the present invention also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer system. The computer-readable medium storing the computer-executable instructions is a physical storage medium. Computer-readable media bearing computer-executable instructions are transmission media. Thus, by way of example, and not limitation, embodiments of the invention can include at least two significantly different computer-readable media: physical computer-readable storage media and transmission computer-readable media.
Physical computer storage media includes RAM, ROM, EEPROM, CD-ROM or other optical disk storage (e.g., CD, DVD, etc.), magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer.
A "network" is defined as one or more data links that allow electronic data to be transferred between computer systems and/or modules and/or other electronic devices. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer, the computer properly views the connection as a transmission medium. Transmission media can include a network and/or data links which can be used to carry desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer. Combinations of the above are also included within the scope of computer-readable media.
Furthermore, program code means in the form of computer-executable instructions or data structures can be transferred automatically from transmission computer readable media to physical computer readable storage media (or vice versa) upon reaching various computer system components. For example, computer-executable instructions or data structures received over a network or a data link may be cached in RAM within a network interface module (e.g., a "NIC") and then ultimately transferred to computer system RAM and/or to less volatile computer-readable physical storage media at a computer system. Thus, a computer-readable physical storage medium may be included in a computer system component that also (or even primarily) utilizes transmission media.
Computer-executable instructions comprise, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. The computer-executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, or even source code. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the features and acts described above are disclosed as example forms of implementing the claims.
Those skilled in the art will appreciate that the invention may be practiced in network computing environments with many types of computer system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, routers, switches, and the like. The invention may also be practiced in distributed system environments where local and remote computer systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks. In a distributed system environment, program modules may be located in both local and remote memory storage devices.
The present invention may be embodied in other specific forms without departing from its spirit or characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Claims (7)
1. A method of delivering an event (104) to a consumer, the method comprising:
accessing private data (702);
normalizing the private data to create a normalized event (104) (704);
determining a plurality of end consumers (706) that will receive the event (104) based on the subscription;
formatting (708) data from the normalized events (104) into a plurality of different formats appropriate for all of the determined end consumers; and
delivering data from the normalized events (104) to each of the plurality of end consumers in a format suitable for the end consumer (710).
2. The method of claim 1, wherein accessing private data comprises: data is accessed from multiple sources.
3. The method of claim 1, wherein delivering the data from the normalized event to each of the plurality of end consumers in a format appropriate for the end consumer comprises: data from the event is first fanned out in the normalized format.
4. The method of claim 1, wherein delivering the data from the normalized event to each of the plurality of end consumers in a format appropriate for the end consumer comprises: packaging the event into a plurality of bundles, wherein each of the bundles includes the event in the normalized format and a routing list that identifies a plurality of end consumers, including identifying a format for the end consumers identified in the routing list.
5. The method of claim 4, wherein packaging the event into a plurality of bundles comprises: querying a database of end consumer preferences to determine which end consumers are included in the routing list.
6. The method of claim 1, wherein normalizing the private data to create a normalized event comprises: representing the data as key-value pairs accompanied by a single opaque binary data block that is not further interpreted by the event normalization system.
7. The method of claim 1, wherein formatting the data from the normalized event into the plurality of different formats appropriate for all of the determined end consumers comprises: mapping one or more attributes from the normalized event to a target format by mapping message attributes having the same name.
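The claimed method can be sketched as a small pipeline: raw source data is normalized into key-value pairs plus an opaque blob (claim 6), matching subscribers are bundled into a routing list with per-target formats (claims 4-5), and one formatted copy is fanned out per target (claims 3 and 7). The sketch below is illustrative only; every class name, field, and format string is an assumption, not the patented implementation.

```python
import json
from dataclasses import dataclass

@dataclass
class NormalizedEvent:
    # Claim 6: key-value pairs accompanied by a single opaque binary
    # data block that the normalization system does not interpret.
    attributes: dict
    opaque_blob: bytes = b""

@dataclass
class Subscriber:
    name: str
    topics: set
    fmt: str  # e.g. "json" or "text"; hypothetical format identifiers

def normalize(raw: dict) -> NormalizedEvent:
    # Represent the private source data as flat key-value pairs.
    attrs = {str(k): str(v) for k, v in raw.items() if k != "payload"}
    return NormalizedEvent(attrs, raw.get("payload", b""))

def route(event: NormalizedEvent, subscribers: list) -> dict:
    # Claims 4-5: bundle the normalized event with a routing list that
    # records each matching target and its required format.
    routing = [(s.name, s.fmt) for s in subscribers
               if event.attributes.get("topic") in s.topics]
    return {"event": event, "routing": routing}

def format_for(event: NormalizedEvent, fmt: str) -> str:
    # Claim 7: map attributes into the target format by name.
    if fmt == "json":
        return json.dumps(event.attributes, sort_keys=True)
    return "; ".join(f"{k}={v}" for k, v in sorted(event.attributes.items()))

def deliver(bundle: dict) -> dict:
    # Claim 3: fan out one formatted copy per target in the routing list.
    return {name: format_for(bundle["event"], fmt)
            for name, fmt in bundle["routing"]}

raw = {"topic": "goal", "team": "Lions", "score": "2-1", "payload": b"\x01"}
subs = [Subscriber("phone-a", {"goal"}, "json"),
        Subscriber("phone-b", {"goal", "news"}, "text")]
out = deliver(route(normalize(raw), subs))
```

Here both subscribers are subscribed to the "goal" topic, so each receives the same normalized event rendered in its own format, which is the core fan-out behavior the claims describe.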
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US61/533,669 | 2011-09-12 | ||
| US13/278,415 | 2011-10-21 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| HK1181579A true HK1181579A (en) | 2013-11-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9208476B2 (en) | Counting and resetting broadcast system badge counters | |
| US20130067024A1 (en) | Distributing multi-source push notifications to multiple targets | |
| JP6126099B2 (en) | Marketplace for timely event data distribution | |
| US8595322B2 (en) | Target subscription for a notification distribution system | |
| US11818049B2 (en) | Processing high volume network data | |
| CN107431664B (en) | Messaging system and method | |
| US20130066980A1 (en) | Mapping raw event data to customized notifications | |
| US20160219089A1 (en) | Systems and methods for messaging and processing high volume data over networks | |
| US8694462B2 (en) | Scale-out system to acquire event data | |
| JP2014531072A (en) | Distributing events to many devices | |
| HK1181579A (en) | Distributing multi-source push notifications to multiple targets | |
| CN103051465B (en) | Counting and replacement to broadcast system badge counter | |
| HK1183753A (en) | Distributing events to large numbers of devices |