GB2557877A

GB2557877A - Changing URL generation

Info

Publication number: GB2557877A
Application number: GB1609426.0A
Authority: GB
Inventors: Knight Patrick
Original assignee: Individual
Current assignee: Individual
Priority date: 2016-05-27
Filing date: 2016-05-27
Publication date: 2018-07-04
Also published as: GB201609426D0

Abstract

Web server 101 receives a request for a web page from a client, identifies that the web page includes content requiring the client to be provided with an identifier, generates the identifier and provides it to the client. The identifier includes data that changes from an identifier generated in response to one client request to an identifier generated in response to another client request. On determination that the client has not requested the resource using the identifier, access to a website associated with the web page is restricted, possibly until the client prevents a filter from blocking resources being downloaded or makes a subscription payment. The identifier may include an address associated with a source of the web page or a data string that is formed by encrypting a Uniform Resource Locator or address associated with the resource. Links may be embedded within, e.g., an online newspaper web page that trigger the client to request advertising content from third party provider 104. Addresses that are not identifiable to an ad blocker or client-side filter may be used to circumvent software which recognises URLs and HTML elements that are associated with adverts or browsing trackers embedded within web pages.

Description

(54) Title of the Invention: Changing URL generation

(56) Documents Cited: None

(58) Field of Search: Other: No search performed: Section 17(5)(b)

(71) Applicant(s): Patrick Knight, Clock House, Sopley, Christchurch, Hampshire, BH23 7AT, United Kingdom

(72) Inventor(s): Patrick Knight

(74) Agent and/or Address for Service: Slingsby Partners LLP, Kingsway, LONDON, WC2B 6AN, United Kingdom

Abstract Title: Providing identifiers e.g. encrypted URLs to a client in response to a web page request

[Drawings: sheets 1/8 to 8/8, Figures 1 to 8]

CHANGING URL GENERATION

This invention relates to techniques and appropriately configured apparatus for circumventing client-side filters, such as ad blockers.

Accessing a web page involves sending a request to the web server identified by the uniform resource locator (URL) associated with that web page. The URL identifies the protocol that the request is being made under (which will be the hypertext transfer protocol or http in most instances), the name of the web server hosting the web page and some indication of the content that is being requested. The content returned by the web server may include text, images, sound, video etc. It may also include links to content provided by another web server. In many instances this content may be considered a sub-resource: it is designed to be slotted into a host web page rather than being capable of functioning as a standalone web page. This mechanism is frequently used to incorporate adverts within a host web page. The host web server provides the requested web page, such as the page of an online newspaper, and links are embedded within that web page that trigger the client to request advertising content from third party web servers.

Many users find unsolicited adverts annoying and the use of ad blocking software is growing. Ad blocking software works by recognising URLs and HTML elements that are associated with advertising content and blocking the client from requesting that content. Effectively it acts as a client-side filter. One problem with the growing use of ad blockers is that it is the revenue from advertising that often pays for the web content that users want to view. There is a risk that the routine use of ad blockers will seriously damage the advertising model of funding web content.

According to a first aspect, there is provided a method comprising receiving a request for a web page from a client, identifying that the web page includes a resource that requires the client to be provided with an identifier, generating the identifier to include data that changes from an identifier generated for that resource responsive to one client request for said web page to the identifier generated for that resource responsive to another client request for said web page and providing the client with the generated identifier.

Other aspects may include one or more of the following:

The method may comprise generating the identifier to include an address associated with a source of the web page. The method may comprise generating the identifier to comprise a data string that is formed by encrypting an address associated with the resource. The data string may be generated by encrypting the address associated with the resource with a key and using a different key to generate an address for providing to the client responsive to one client request than to generate an address for providing to the client responsive to another client request. The identifier may be generated to include the key used to generate the data string. The identifier may be generated to include a data string that is representative of a third party source of the resource.

The method may comprise receiving a request for the resource from the client, said request including the generated identifier, recognising from the generated identifier that the request relates to the resource and providing the client with the resource. The method may comprise recognising that the generated identifier includes an encrypted data string and decrypting said data string to generate the address associated with the resource. The method may comprise recognising that the generated identifier incorporates the key used to encrypt the data string and decrypting the data string using that key.

The method may comprise recognising from the generated identifier that the resource is provided by a third party source and forwarding the request to said third party source. The method may comprise the third party source providing the resource to the source of the web page and the source of the web page providing the resource to the client.

The method may comprise generating the identifier to designate an attribute associated with content comprised in the resource. The resource may comprise content and information defining the attribute, and the method may comprise assigning the generated identifier to the content and to the information defining the attribute.

The method may comprise determining whether the client is associated with a valid subscription to the web page. The method may comprise, if the client is associated with a valid subscription to the web page, either: not providing the client with an identifier associated with the resource; or providing the client with an identifier associated with the resource but generating that identifier to be the same as an identifier generated for that resource responsive to another client request for said web page.

The method may comprise providing the generated address as part of the requested web page. The method may comprise determining that the client has not requested the resource using the identifier and restricting the client’s access to a web site associated with the requested web page responsive to that determination.

According to a second aspect, there is provided a web server configured to receive a request for a web page from a client, identify that the web page includes a resource to be provided to the client by way of providing the client with an address, generate the address to include data that changes from an address generated for that resource responsive to one client request for said web page to the address generated for that resource responsive to another client request for said web page and provide the client with the generated address.

According to a third aspect, there is provided a method comprising receiving a request for a web page from a client, identifying that the web page includes a resource that requires the client to be provided with an identifier, providing the client with the identifier, determining that the client has not requested the resource using the identifier and restricting the client’s access to a website associated with the requested web page responsive to that determination.

Other aspects may include one or more of the following:

The method may comprise restricting the client’s access to the website by providing the client with a non-requested web page in response to a further request from the client for a web page. The method may comprise restricting the client’s access to the website until the client performs one or more of the following: preventing a filter at the client from blocking resources from being downloaded or displayed by the client; and accepting a charge for viewing the web site without the resource.

The present invention will now be described by way of example with reference to the accompanying drawings. In the drawings:

Figure 1 shows an example of a web server for communicating with a client and a third party content provider;

Figure 2 shows an example of a method for providing a client with web page content;

Figure 3 shows an example of a method for protecting web content from being blocked by a client-side filter;

Figure 4 shows an example of a message exchange between a client, a web server and a provider of third party content;

Figure 5 shows an example of a method for associating a subscription with advert free content;

Figure 6 shows an example of a method for testing subscription price points;

Figure 7 shows an example of a method of responding to a client-side filter that might be blocking a changing URL; and

Figure 8 shows an example of a method for changing a content attribute from one client request to another.

An example of a web server is shown in Figure 1. The web server 101 is configured to receive requests for web pages from a client 103 via the internet 102. The web server includes a communications unit 105 for communicating with clients and third parties via the internet. The communications unit is particularly configured to receive web page requests and provide clients with requested resources. The web server also comprises a request processing unit 106 and a content provisioning unit 107.

The content provisioning unit is suitably configured to provide clients with requested web page resources. It may have access to a memory, such as content store 108, from which it can access the resources to build a requested web page. The resources may be separated into formatting and content information, so that the actual content of a resource is separated from information defining how it should be presented.

In some scenarios not all of the content associated with a given web page may be provided by the web server. Clients may also be directly or indirectly provided with content by third party provider 104, which may also be implemented by a web server. In some implementations these resources may take the form of a web page fragment or sub-resource. These resources are likely to form a particular section of the web page, with a designated location on the page. A third party resource could, for example, form an advertising banner. Some example implementations are described below in which the resource provided by the third party is advertising or tracking-related but it should be understood that the methods and apparatus described herein are not limited to this. A third party resource could relate to any content for a web page.

Figure 2 illustrates an example of a method for providing web content via an identifier, which may be embedded within a requested web page.

In step 201, the web server receives a request for a web page, e.g. at communications unit 105. This request is likely to have been triggered by a user of the client computer entering an address associated with web server 101 into their web browser. An address identifies a source of internet content, in this instance a web page. In most implementations, addresses will take the form of URLs (Uniform Resource Locators). The explanations below will frequently refer to addresses by the acronym “URL”. It should be understood that this is for the purposes of example only and any suitable form of address that serves to identify a source of content/resources could be used.

In step 202, it is identified that the web page includes a resource that requires the client to be provided with an identifier associated with that resource. The identifier may be any name associated with the resource. The identifier may be any data that designates a source or attribute of the resource. In some embodiments the identifier may be an address. Providing the client with the address may trigger it to request the resource from that address. The identifier could be a name associated with the resource by the designer of the web page. For example, the identifier may designate some attribute or style associated with the resource. In some embodiments the resource may be provided by the content provisioning unit 107 within web server 101. In other embodiments the resource may be provided by third party content provider 104 (although it may often be the web server 101 that ultimately forwards that resource to the client).

In step 203, the web server generates an identifier to send to the client. The identifier may be generated by an identifier generator 112 within a filter circumvention unit 111. The web server preferably generates the identifier to be different from identifiers generated for other client requests, even though those client requests are for the same web page and the identifier represents the same resource. This makes it very difficult for a client-side filter to stop the client from displaying and/or downloading the resource because the identifier changes each time. The filter does not know what identifier to block.

In one implementation the identifier that is provided to the client is generated from an address that is already associated with the resource - such as a URL. The reason for generating the address is to trigger the client to download content from it. While the address that is provided to the client changes from one client request to another, the resource is always associated with the same address at the web server. The resource can thus still be retrieved from the same internet location.

In another implementation the identifier that is provided to the client relates to an attribute of the resource. In one example the attribute may relate to how content within the resource is formatted. The resource may comprise two identifiers: one for the content of the resource and one for a formatting file that accompanies the content. The two identifiers are preferably both changed from one client request to the next so that they match each other. Thus while the identifiers change from one client request to another the resource itself remains substantially unchanged because the content and its associated formatting remain the same.
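As a minimal sketch of the matched-identifier idea, the following Python fragment generates one fresh class name per request and applies it both to the content markup and to the accompanying style rule. The function name, the `c` prefix and the banner dimensions are illustrative assumptions and do not come from the specification.

```python
import secrets


def render_with_changing_class(ad_html_body: str) -> tuple[str, str]:
    """Return (html, css) that share one freshly generated class name."""
    # Prefix with a letter so the token is always a valid CSS class name.
    class_name = "c" + secrets.token_hex(8)
    # The same name is assigned to the content ...
    html = f'<div class="{class_name}">{ad_html_body}</div>'
    # ... and to the formatting information, so the two still match.
    css = f".{class_name} {{ width: 300px; height: 250px; }}"
    return html, css
```

Because the name is regenerated on every call, a filter that hides elements by a fixed class name has nothing stable to match, while the rendered result stays the same.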

In step 204 the generated identifier is provided to the client. Providing the client with this changing identifier should cause the user to be presented with the complete web page via their internet browser, irrespective of any client-side filters that the client may have installed.
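Steps 201 to 204 can be sketched as follows. This is only an illustration under stated assumptions: the identifier is a random token that the server remembers in a lookup table (one possible way of making it change per request, not the encryption approach described later), and the template, the resource path `/ads/banner.png` and all function names are hypothetical.

```python
import secrets

# Hypothetical page template: "{ad_id}" marks where the resource identifier goes.
PAGE_TEMPLATE = '<html><body><p>Article text</p><img src="/{ad_id}"></body></html>'


def generate_identifier() -> str:
    """Step 203: produce an identifier that differs on every call."""
    return secrets.token_urlsafe(24)


def serve_page(issued: dict) -> str:
    """Steps 201-204: build the requested page with a fresh identifier."""
    ad_id = generate_identifier()
    # The resource itself keeps the same server-side address every time.
    issued[ad_id] = "/ads/banner.png"
    return PAGE_TEMPLATE.format(ad_id=ad_id)


issued = {}
first = serve_page(issued)
second = serve_page(issued)
```

Here `first` and `second` are responses to two requests for the same page; they refer to the same resource through two different identifiers.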

Some techniques for generating different identifiers in response to different client requests for web resources will now be described with reference to a number of practical examples.

Third-Party Adserving Solution

In this example advertising content is to be downloaded from a third party provider. An advantage of applying the address-changing technique described above within this context is that it enables the existing eco-system surrounding the provision of advertising to be maintained. Advertising content can still be served from a third party ad provider rather than becoming the responsibility of the publisher. Any client-side ad blocker software can be circumvented.

In this example the client may request access to a newspaper website at:

http://dailygossip.com

The Daily Gossip makes its online content available for free. This is paid for by adverts that surround the premium content of the page.

Many people find online adverts annoying and have downloaded ad blocking software to act as a client-side filter. Ad blockers usually work by preventing the browser from loading content from specific sources. The block may apply to entire domains (e.g. block anything coming from doubleclick.net) or to particular address patterns (e.g. block any connection to an URL that starts with /ad/). This works because the advert sources are associated with recognisable addresses, either because they are the same each time or because they follow a recognisable pattern. The advert sources are also associated with addresses that are different from that of the publisher content, so the advert source addresses can be blocked without preventing the user from viewing the publisher content. Hence the user effectively gets to read the premium publisher content “for free” by being able to view it without any adverts.
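The two blocking rules mentioned above (whole domains and recognisable path patterns) can be illustrated with a short Python sketch of a client-side filter. The rule lists and the function name are assumptions made for the example; real ad blockers use far larger rule sets.

```python
from urllib.parse import urlsplit

# Illustrative rule lists only.
BLOCKED_DOMAINS = {"doubleclick.net"}
BLOCKED_PATH_PREFIXES = ("/ad/",)


def is_blocked(url: str) -> bool:
    """Return True if the URL matches a blocked domain or path pattern."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Domain rule: the host is the blocked domain or one of its subdomains.
    domain_hit = any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)
    # Pattern rule: the path starts with a recognisable advert prefix.
    path_hit = parts.path.startswith(BLOCKED_PATH_PREFIXES)
    return domain_hit or path_hit
```

A URL on the publisher domain whose path is an unpredictable string matches neither rule, which is the property the changing-URL technique relies on.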

The Daily Gossip is reliant on advertising revenue to pay the staff that generate content for its site. It either wants to prevent users from blocking adverts or charge users for viewing content. One improvement can be achieved by not having the address point to a different domain from the publisher. For example, if the following URL were sent to the client:

the ad blocking software would see that it points to the publisher (the Daily Gossip in this instance) and might not move to automatically block the browser from sending a request to it. A human user might readily make the connection though and add “path/to/resource” to a list of URLs to be blocked by the ad blocking software.

A preferred approach is to circumvent ad blocking software by serving adverts from addresses that are not identifiable to the software or to the user. This can be done by changing the address that is provided to clients from one client request to another. This may be achieved by taking the domain path /path/to/resource?param1=value1 and replacing it with an unpredictable data string. One option for generating such a data string is to encrypt all or part of the URL associated with the advert. The encryption could use a different key for each client request. For example, the data string could be generated for the example above by encrypting “/path/to/resource?param1=value1” using a different key each time. The URL might now look like:

http://dailygossip.com/hf7sk84jf0034kf==fs83hqp29lajqbspgu27voscutqnv0

but on a different occasion the URL might look like:

The domain path may be encrypted by placing it within a JSON (JavaScript Object Notation) object. The encryption may be symmetric, so that the same key is used for encryption of the basic text and decryption of the cypher text. In this example the key may be included within the URL provided to the client so that the encrypted data string can be readily decrypted when that URL comes to be used to retrieve the resource. For example, if key abc123 was used to encrypt the domain path /path/to/resource?param1=value1, the URL that is provided to the client may be:

http://dailygossip.com/hf7sk84jf0034kf==fs83hqp29lajqbspgu27voscutqnv0/abc123

Similarly, on another occasion, the domain path /path/to/resource?param1=value1 may be encrypted by the key base64, resulting in the following URL being provided to the client:

http://dailygossip.com/bhankaskikU73ehbs6y6e3hxhsdaew902hjbhsxvhgd54wbzg/base64

Since the data string is different each time, the URL is different each time. There is no identifiable text string that can be used to block this URL on the client side. In addition, if the data string is unpredictable, there is no pattern that can be detected between one URL and the next. The ad blocker could block the domain with which these random data strings are associated, but since that domain is the publisher domain it cannot be blocked without also blocking the publisher content.
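The scheme described above (path wrapped in a JSON object, symmetric encryption with a per-request key, key appended to the URL) might be sketched in Python as follows. The cipher here is a toy XOR keystream built from SHA-256, chosen only so that the example runs with the standard library; it is an assumption for illustration, is not secure, and is not the algorithm of the specification. The function names and the `url` field are likewise hypothetical.

```python
import base64
import hashlib
import json
import secrets


def _keystream(key: str, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode. NOT secure; illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, key: str) -> bytes:
    # XOR is its own inverse, so the same routine encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def encrypt_path(path: str, key: str) -> str:
    """Wrap the path in a JSON object, encrypt it and make it URL-safe."""
    plain = json.dumps({"url": path}).encode()
    return base64.urlsafe_b64encode(_xor(plain, key)).decode().rstrip("=")


def decrypt_path(token: str, key: str) -> str:
    """Reverse encrypt_path using the key carried in the URL."""
    padded = token + "=" * (-len(token) % 4)
    plain = _xor(base64.urlsafe_b64decode(padded), key)
    return json.loads(plain)["url"]


def make_changing_url(path: str, domain: str = "dailygossip.com") -> str:
    """Build a publisher-domain URL of the form /<data string>/<key>."""
    key = secrets.token_hex(4)  # fresh key for every client request
    return f"http://{domain}/{encrypt_path(path, key)}/{key}"
```

Calling `make_changing_url("/path/to/resource?param1=value1")` twice yields two different URLs on the publisher domain, and splitting either one on its last two `/` separators gives the data string and key needed by `decrypt_path`.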

There is a small risk that even encrypted URLs might be recognised by a client-side filter or a human user, particularly if the encryption algorithm used is one that generates data strings having a particular format (e.g. the encryption algorithm outputs a data string that is always the same number of characters in length). One option for addressing this is to generate the data strings such that they do not have a consistent format from one client request to another. Another is to detect that a client-side filter is blocking the client from making requests to the encrypted URL and block the client from accessing the publisher website accordingly. This option is described in more detail in the “Blocked Encrypted URLs Solution” below.
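One way to avoid a consistent format, sketched here purely as an assumption, is to add a random-length filler field to the object before it is encrypted, so that the resulting data string varies in length from one request to another. The field names `url` and `pad` and the function name are hypothetical.

```python
import json
import secrets


def pad_payload(path: str, max_pad: int = 32) -> str:
    """Return a JSON payload whose length varies between calls."""
    # Choose a filler length between 0 and max_pad characters.
    pad_len = secrets.randbelow(max_pad + 1)
    # token_hex yields two characters per byte, so trim to the chosen length.
    filler = secrets.token_hex(pad_len)[:pad_len]
    return json.dumps({"url": path, "pad": filler})
```

The receiver simply ignores the `pad` field after decryption and reads the address from `url`.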

Referring to Figure 1, the request processing unit 106 within web server 101 is suitably configured to recognise that some parts of a requested web page include a resource that is not within the jurisdiction of the web server. That resource will not be provided by the content provisioning unit 107. Instead it is the responsibility of third party provider 104. The web server is suitably configured to encrypt the URL associated with the resource and return the encrypted URL to the client. This will trigger the client to request the third party resource from the publisher server. The encrypted URL may be generated by filter circumvention unit 111.

The filter circumvention unit 111 may be implemented by software configured to run on the web server. This software may also be configured to act as a catch-all URL handler, e.g. by taking control of the web server’s 404 handling logic. This is represented by address handler 114 in Figure 1. Filter circumvention unit 111 and address handler 114 may together be considered as an “ad pass” tool. This tool may be provided to the publisher by a third party software provider.

More detailed examples of how the ad pass tool may operate are shown in Figures 3 and 4.

Figure 3 relates to how requests for a URL might be handled by a web server. A request for a web page is received by the ad pass tool in step 301. In step 302 it is determined by address handler 114 whether the URL points to a valid publisher resource - such as an article, an image, video, script, stylesheet etc. In one example the validity of a resource may be determined based on a JavaScript Object Notation (JSON) object that contains the requested URL. If the answer is yes, the request is passed to the main part of the publisher to provide the requisite content (step 303). The request processing unit 106 may then determine whether the request involves content from a third party provider 104. If the answer is no, only publisher content is returned to the client 103 (step 305). If the answer is yes, address generator 112 selects a new key (step 306) and encrypts a data string associated with the third party content to be provided in response to the request (step 307). This random data string is appended to the publisher domain (step 308) and returned as a random URL to form part of the content provided to the client (step 309).

If the address handler 114 determines that the URL points to an invalid resource (step 302) it may pass the request to the address decoder 113 to decrypt the data string using the appropriate key. The filter circumvention unit 111 suitably identifies the correct key, e.g. in dependence on a session token from the client making the request or a key included in the URL itself. The decrypted data string suitably identifies the third party content provider. It may be passed to that third party provider by the ad pass tool, along with information to enable the third party provider to build the appropriate response to the client request, such as parameters, cookies etc (step 311). The third party content provider and the web server preferably communicate directly via their respective communication units 105, 115 (e.g. over the internet) rather than through the client browser. The third party content provision unit 114 provides the required content to the web server (step 312). This content could take any form, including an image, a script, an entire HTML document etc. The web server returns this third party content to the client in step 313.
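The catch-all handling of Figure 3 can be illustrated with the Python sketch below. It assumes that a changing URL has the form `/<data string>/<key>`, that decoding failures are signalled with `ValueError`, and that the third party is reached through a caller-supplied function; the resource table, the stub functions and the example address are all hypothetical.

```python
# Hypothetical table of valid publisher resources (step 302).
PUBLISHER_RESOURCES = {
    "/": "<html>home page</html>",
    "/news/today": "<html>article</html>",
}


def handle_request(path, decode, fetch_third_party):
    """Return (status, body) for a request path received by the web server."""
    # Valid publisher resource: serve it in the normal way.
    if path in PUBLISHER_RESOURCES:
        return 200, PUBLISHER_RESOURCES[path]
    # Otherwise treat the path as "<data string>/<key>" and try to decode it.
    try:
        token, key = path.strip("/").rsplit("/", 1)
        third_party_url = decode(token, key)
    except ValueError:
        return 404, "not found"
    # Fetch server-to-server and relay the content to the client.
    return 200, fetch_third_party(third_party_url)


def demo_decode(token, key):
    """Stand-in for the address decoder; accepts one known pair only."""
    if (token, key) != ("hf7sk84jf", "abc123"):
        raise ValueError("unknown data string")
    return "http://ads.example/banner"


def demo_fetch(url):
    """Stand-in for the request to the third party content provider."""
    return f"<img alt='advert from {url}'>"
```

With these stubs, `handle_request("/hf7sk84jf/abc123", demo_decode, demo_fetch)` returns the third party content from the publisher domain, while an unknown path falls through to an ordinary 404.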

An example of the exchange of messages that may occur between the client device, the web server and the third party content provider is shown in Figure 4. In step 401 the client requests a publisher web page from the publisher web server. This web page incorporates third party content so in step 402 the publisher web server returns an encrypted URL. This triggers the client to request the resource associated with the encrypted URL from the publisher web server (step 403). The publisher web server retrieves this resource from the third party web server (steps 404 and 405) and returns it to the client (step 406).

The third party content provider may be provided with software that is specifically configured to enable the third party server to interact with the ad pass tool.

In the examples described above all of the functions of the ad pass tool are implemented on the publisher web server. However, any or all of these functions could equally be implemented on a separate server, which might be termed the “ad pass” server.

Paywall Solution

There is often a tension between users’ annoyance at online advertising and their expectation that all web content should be free. Figure 5 illustrates a method for handling this trade-off, by presenting an example of how applying the address-changing technique described above can be used to help protect premium web content. This has the advantage of enabling a more sophisticated approach to protecting the revenue stream of premium web content providers than simply blocking the client from viewing the premium content at all. In this example the user is presented with a choice: to view the web page with adverts or pay a fee to view the page advert-free.

In step 501, a request for a web page is received at a web server. The ad pass tool detects the presence of an ad blocker. If there is no ad blocker then the ad pass tool takes no further action. If an ad blocker is detected then the ad pass tool takes the following action. In step 502 it is checked whether the request is associated with a valid subscription to view the website associated with that web page advert free. This may be handled by a subscription unit 110 within web server 101. Subscriber details may be stored in a subscriber database 109. The user may have to verify their identity, for example by entering a username and/or password. If the user is found to already have a valid advert-free subscription, the method may proceed to step 505 in which the client is provided with the normal publisher content. The client is then free to block any adverts in that content. If the request is not associated with a valid subscription, the client is redirected to an “ad pass purchase” page and asked whether the user wants to pay a fee to view the publisher content advert-free. If the user accepts, the method proceeds to step 505, as before. If the user declines, the method proceeds to step 504 and enables filter circumvention unit 111, so that advertising content is provided via changing URLs that cannot be readily blocked by an ad blocker. Alternatively, based on the preference of the publisher, the user may simply be blocked from browsing the website.
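The decision flow of Figure 5 can be summarised in a short Python sketch. The enumeration, the function name and the boolean parameters are assumptions introduced for illustration; how an ad blocker or a subscription is actually detected is outside the scope of the example.

```python
from enum import Enum


class Action(Enum):
    NORMAL_CONTENT = "provide normal publisher content"
    CHANGING_URLS = "provide adverts via changing URLs"
    BLOCK_SITE = "block the user from browsing the website"


def decide(ad_blocker_detected: bool,
           has_subscription: bool,
           accepts_fee: bool,
           publisher_blocks_decliners: bool = False) -> Action:
    """Choose how to respond to a page request, following Figure 5."""
    # No ad blocker: the ad pass tool takes no further action.
    if not ad_blocker_detected:
        return Action.NORMAL_CONTENT
    # Valid subscription, or fee accepted on the purchase page: step 505.
    if has_subscription or accepts_fee:
        return Action.NORMAL_CONTENT
    # User declined; the publisher may prefer simply to block access.
    if publisher_blocks_decliners:
        return Action.BLOCK_SITE
    # Otherwise enable filter circumvention: step 504.
    return Action.CHANGING_URLS
```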

The publisher web server can return content free of adverts simply by omitting to embed links to those adverts in the content it returns to the client. An alternative would be to continue to embed those links but without enabling filter circumvention unit 111, so that any advert links that are returned are readily blocked by an ad blocker.

A payment to view a website advert free could take the form of a one-off fee or it could be a repeated payment, such as a monthly subscription. A one-off fee could entitle a user to view the website for a particular period of time and/or a particular number of times. Each user who has a subscription may be associated with a token that can be used to monitor how many times they have accessed the website.

The subscription unit may be configured in some implementations to vary the fee that it offers users in order to try to determine an optimum price point at which revenues for the publisher are likely to be maximised. The subscription unit may, for example, include a comparison unit 116 for presenting different options to different users and keeping track of which options result in users taking up a subscription. The different options presented to users could include different price points and/or different subscription models, e.g. one-time access or multiple-time access. The comparison unit may be configured to implement any suitable form of statistical hypothesis testing, including A/B testing, bucket testing or split-run testing. The comparison unit may be configured to run a test only when triggered by a user of the web server or it may be configured to automatically perform testing at predetermined intervals.

An example of a method for determining an appropriate price point for publisher content is shown in Figure 6.

In step 601 the web server receives a request for a web page from a user. In step 602 the web server, via comparison unit 116, selects a price point for the user to view the web page. This price point may be varied from one client request to another. For example, one user may be offered one price point (e.g. £4 a week) while another user is offered a second price point (e.g. £5 a week). The different price points may also include different subscription models. The user’s selection may be recorded (step 603) and this process may be repeated multiple times for different users and across two or more different price points (step 604). The conversion rate is then calculated for different price points (step 605) and this information allows the best price point from a conversion and revenue perspective to be selected for a web site.

The comparison unit 116 may be configured to run A/B testing software to implement the process shown in Figure 6.
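The process of Figure 6 could be sketched as follows. This is a minimal illustration only, assuming a uniform random split between price points and a simple revenue-per-offer metric; the class name `ComparisonUnit`, its method names and the `PRICE_POINTS` values are hypothetical, and no statistical significance test is included.

```python
import random
from collections import defaultdict

# Hypothetical weekly price points in pounds, taken from the example above.
PRICE_POINTS = [4.0, 5.0]


class ComparisonUnit:
    """Toy stand-in for comparison unit 116: assigns prices and tallies results."""

    def __init__(self, price_points, rng=None):
        self.price_points = list(price_points)
        self.rng = rng or random.Random()
        self.offers = defaultdict(int)       # price -> number of users offered it
        self.conversions = defaultdict(int)  # price -> number of users who subscribed

    def select_price(self):
        """Step 602: pick a price point for this request (uniform random split)."""
        price = self.rng.choice(self.price_points)
        self.offers[price] += 1
        return price

    def record_selection(self, price, subscribed):
        """Step 603: record whether the user took up the offer."""
        if subscribed:
            self.conversions[price] += 1

    def summary(self):
        """Step 605: conversion rate and revenue per offer for each price point."""
        result = {}
        for price in self.price_points:
            offered = self.offers[price]
            rate = self.conversions[price] / offered if offered else 0.0
            result[price] = {
                "conversion_rate": rate,
                "revenue_per_offer": rate * price,
            }
        return result

    def best_price(self):
        """Pick the price point with the highest revenue per offer."""
        stats = self.summary()
        return max(self.price_points, key=lambda p: stats[p]["revenue_per_offer"])
```

In use, `select_price()` would be called once per web page request and `record_selection()` once the user accepts or declines. A lower conversion rate at a higher price can still yield more revenue per offer, which is why the sketch ranks price points by revenue rather than by conversion rate alone.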

Blocked Encrypted URLs Solution

Address handler 114 may be configured to monitor whether a client that has been provided with a changing URL subsequently requests a resource using that URL. If no such request is received, it could be an indication that a client-side filter is blocking the changing URL. The client may thus be displaying publisher content without adverts. An example of a technique for addressing this situation is shown in Figure 7.

In step 701 the publisher server provides the client with a requested resource and an encrypted URL. In step 702 the ad pass tool detects that the client has not requested resources from the encrypted URL. The address handler may be configured to make this determination if no request has been received within a predetermined period of time from the client being provided with the encrypted URL. In step 703 the client requests additional resources from the publisher server. For example, the user may have clicked on a link on the publisher home page. This request is detected by address handler 114. (In an alternative implementation it may be the request for additional resources that triggers the determination that the client has not made any requests associated with the encrypted URL - thus reversing the order of steps 702 and 703). The client is then redirected to a holding web page rather than being provided with the requested resource (step 704). The holding web page might suitably be provided by a content provision unit 117 and associated content database 118 within the subscription unit. The holding web page may be a static web page. This web page may include a message informing the user that they have been blocked from accessing the requested resource because it has been detected that they have a client-side filter (e.g. an ad blocker) installed. The web page may instruct the user to add the publisher domain to a safe list in their client-side filter if they want unrestricted access to the publisher web site. The web page may also offer the user the opportunity to take out a subscription to the publisher web site, e.g. as described above in the “Paywall Solution” section.
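The steps of Figure 7 could be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: clients are assumed to be distinguishable by some `client_id` (e.g. a session identifier), and the class name `AddressHandler`, the `HOLDING_PAGE` path and the timeout value are all hypothetical.

```python
import time

HOLDING_PAGE = "/ad-blocker-detected"  # hypothetical path of the holding web page


class AddressHandler:
    """Toy stand-in for address handler 114: tracks whether issued URLs are used."""

    def __init__(self, timeout_seconds=10.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.pending = {}  # client_id -> (encrypted_url, time issued)

    def issue(self, client_id, encrypted_url):
        """Step 701: remember the encrypted URL provided to this client."""
        self.pending[client_id] = (encrypted_url, self.clock())

    def resource_requested(self, client_id, url):
        """Called when the client requests the resource using the encrypted URL."""
        issued = self.pending.get(client_id)
        if issued and issued[0] == url:
            del self.pending[client_id]

    def is_blocking(self, client_id):
        """Step 702: True if the URL is still unrequested after the timeout."""
        issued = self.pending.get(client_id)
        if issued is None:
            return False
        return self.clock() - issued[1] >= self.timeout

    def route(self, client_id, requested_path):
        """Steps 703-704: send a suspected filter user to the holding page."""
        if self.is_blocking(client_id):
            return HOLDING_PAGE
        return requested_path
```

The `clock` parameter is injectable so that the predetermined period can be exercised without waiting in real time. In this sketch the determination is made lazily inside `route()`, which corresponds to the alternative ordering in which the request for additional resources triggers the check.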

This technique could also be implemented independently of the other techniques described herein, i.e. without having to change an identifier provided to the client from one request to another. For example, this technique could be used to detect when a particular resource has not been requested from the publisher server, and restrict further access to the publisher web site accordingly.

Another option is for the ad pass tool to change its encryption format if it detects that the client is not requesting resources from the encrypted URL. The ad pass tool may have multiple encryption formats available to it. Changing the encryption format may enable the ad pass tool to overcome its previous format being blocked.
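One possible reading of "encryption format" is the textual encoding in which the encrypted bytes are written into the URL. The sketch below rotates between three such encodings when blocking is reported. This interpretation, the `FormatSelector` class and the chosen encodings are assumptions made for illustration; the specification does not say what the formats are.

```python
import base64

# Hypothetical "formats": different textual encodings of the same encrypted bytes.
FORMATS = {
    "hex": lambda b: b.hex(),
    "base32": lambda b: base64.b32encode(b).decode("ascii").rstrip("="),
    "base64url": lambda b: base64.urlsafe_b64encode(b).decode("ascii").rstrip("="),
}


class FormatSelector:
    """Keeps track of which format is in use and rotates it on demand."""

    def __init__(self, names=("hex", "base32", "base64url")):
        self.names = list(names)
        self.index = 0

    @property
    def current(self):
        return self.names[self.index]

    def report_blocked(self):
        """Move to the next format when the current one appears to be blocked."""
        self.index = (self.index + 1) % len(self.names)
        return self.current

    def encode(self, encrypted_bytes):
        """Encode already-encrypted bytes using the format currently selected."""
        return FORMATS[self.current](encrypted_bytes)
```

A decoder on the server side would need to know which format was used for a given URL, for example by trying each format in turn or by recording the format alongside the issued URL.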

Publisher Adserving Solution

Client-side filters could be employed to block selected content provided by the publisher and not just content provided by third party content providers. One example is a scenario in which advertising content is provided by the publisher itself. The publisher could serve advertising content by replicating the mechanisms described above, namely by providing the client with an address to trigger a further request for content from the publisher server. It is more likely, however, that any advertising content will simply be incorporated with the other content that the publisher provides to the client in response to the web page request.

Client-side filters would have to employ different tactics to block content that is provided by the publisher. One possibility would be to recognise an identifier associated with specific content and then block the client from displaying that content to the user. Any suitable identifier could be used. A straightforward option would be to pick an attribute that links specific content with a particular context, so that the desired content to be blocked can be readily identified. This would not necessarily be difficult. It is acknowledged good practice for web designers to assign attributes to web elements that describe a semantic purpose of those elements rather than just their intended display purpose. The ad pass tool may therefore be advantageously configured to change the name of one or more resource attributes from one client request to another.

In one example the attribute may associate content with a particular presentation style. For example, the attribute may be the name of a Cascading Style Sheet (CSS). A CSS describes the presentation of a document written in mark-up language. That presentation might be visual, e.g. including properties such as font style or colour. It might also be non-visual, e.g. if the content is rendered in speech. Multiple styles can be grouped together and saved as a class.

A resource may include both content and its associated formatting information. Content is often grouped together, e.g. by using span and div elements. The whole group can then be associated with one or more attributes. An example is the div element below, which contains an advert for a bakery:

<div class="bakery advert">

Half price bread at Bob’s bakery!

</div>

The class “bakery advert” is an attribute associated with the content “Half price bread at Bob’s bakery!”. A client-side filter could be configured to block any content associated with the class name “bakery advert” or even any class name containing just “ad” from being displayed. To counteract this, the ad pass tool may be configured to change the name of a class (or other identifier) from one client request to another.

An example of a technique for changing an identifier associated with web content from one client request to another is shown in Figure 8. The process starts in step 801 with the publisher server receiving a request for a web resource. In step 802 it is determined that the resource includes an identifier. Usually the identifier will be one that has been identified as being at risk of being blocked by client side filters. In step 803 the identifier is replaced by a new identifier. The new identifier is suitably generated by the filter circumvention unit 111. It could be generated using an encryption method, as described above with reference to URLs. In other embodiments the identifier could just be randomly generated. In step 804 the new identifier is copied across any style files that also form part of the resource. So, returning to the example above, if the class “bakery advert” is replaced by a class “pineapple”, the CSS file that accompanies the div element is preferably also renamed “pineapple”. This is so that the client will associate the appropriate content with the appropriate CSS. The requested resource is then returned to the client in step 805.
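Steps 803 and 804 could be sketched as follows. This is a minimal illustration that assumes the mark-up and the style rules are available as strings and uses regular expressions rather than a full HTML or CSS parser. The function names are hypothetical, and the example class is written as the single token `bakery-advert`, since a space inside an HTML class attribute separates two class names.

```python
import re
import secrets


def new_identifier(prefix="c"):
    """Step 803: random class name; the letter prefix keeps it a valid CSS name."""
    return prefix + secrets.token_hex(8)


def rename_class(html, css, old_name, new_name=None):
    """Steps 803-804: replace old_name in the mark-up and in the CSS selectors."""
    new_name = new_name or new_identifier()

    # Replace the class token inside class="..." attributes in the HTML.
    attr = re.compile(r'(class\s*=\s*")([^"]*)(")')

    def fix_attr(match):
        tokens = [new_name if t == old_name else t for t in match.group(2).split()]
        return match.group(1) + " ".join(tokens) + match.group(3)

    new_html = attr.sub(fix_attr, html)

    # Replace the .old_name selector, but not longer names that start with it.
    selector = re.compile(r"\." + re.escape(old_name) + r"(?![\w-])")
    new_css = selector.sub("." + new_name, css)
    return new_html, new_css, new_name
```

For example, calling `rename_class(html, css, "bakery-advert", "pineapple")` would rewrite both `class="bakery-advert"` in the mark-up and the `.bakery-advert` selector in the style rules, so that the client still associates the content with its styling. Omitting the last argument generates a fresh random name on each call, i.e. on each client request.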

The techniques described above could be implemented across all resources provided by a publisher website or just across resources that are perceived as being particularly at risk of being blocked by client side filters. For example, only identifiers that are associated with advertising might be subject to alteration before being provided to the client.

Tracking Solution

Client-side filters could be used for purposes other than blocking content. They could also be used to block trackers that are used to monitor a user’s behaviour as they view web pages. Trackers may, for example, be used to monitor what a user views and how long they view it for. This information gives useful feedback on the “viewability” of particular content.

A browsing tracker is usually some form of identifier that enables companies to track websites visited by a particular client. Often these are companies that specialise in big data and providing an analysis of that data to their customers. Examples of suitable browsing trackers include cookies, known identifiers, which are typically associated with some form of personal information such as a name or email address, and stable identifiers, which are typically associated with a specific device or browser, such as IDFAs and AdIDs.

Trackers may be embedded within web pages in a similar way to adverts. Typically a tracker is associated with a resource on a web page, which in turn is associated with a specific address or URL. This address is typically associated with a server operated by the tracking company. In a similar way to adverts, the client requests a resource from the tracking company's server in response to the embedded URL. Unlike with adverts, that resource may be invisible to the user. The resource, for example, could provide content for only a very small part of the web page, e.g. a small GIF file that provides content for just a single pixel. That pixel could be rendered transparent, so that it is visually indistinguishable from the web page.

Typically the resource is accompanied by a tracker. This tracker is returned by the client each time that the client requests a resource from the tracking company's server. This enables the tracking company to track the client across all web pages in which it has embedded a resource.

Tracking companies are vulnerable to client-side filters in a similar way to companies that serve advertising content. If the client-side filter blocks the URL of the tracker company's server, then the client will not download the tracker. This issue can be addressed by using any of the techniques described above, e.g. by changing the resource address that is embedded within the web page provided to the client from one client request to another. As above, this can advantageously be done by generating an address for the resource that has the same domain name as the publisher and/or includes an unpredictable data string that can be decrypted to recover an address of the resource.

The tracking company server may be provided with similar software to that of the third party content provider described above, which will enable it to interact with the ad pass tool (whether that tool is incorporated within a publisher server or being run on its own server) in a similar way to that described above. For example, the ad pass tool may receive the tracker and resource from the tracking company server for passing to the client and pass trackers received from clients and/or information about what web page the client requested, time spent on that web page etc back to the tracking company server.

Examples of the invention have been described above with particular reference to circumventing advert blockers and tracker blockers. However, it should be understood that embodiments of the invention may be used to circumvent any type of client-side filter. The resource that is provided to the client could be any content and is not limited to advertising or tracking-related resources.

The structures shown in Figure 1 (and indeed all the block apparatus diagrams included herein) are intended to correspond to a number of functional blocks. This is for illustrative purposes only. Figure 1 is not intended to define a strict division between different parts of hardware on a chip or between different programs, procedures or functions in software. In some embodiments, some or all of the algorithms described herein may be performed wholly or partly in hardware. In most implementations, however, the functional units shown in Figure 1 (including at least the request processing unit, content provisioning unit, filter circumvention unit, address handler, address generator, address decoder, subscription unit and comparison unit) are likely to be implemented by a processor acting under software control. This processor is likely to be the CPU or DSP of a computing device, such as a web server. In some implementations the functional units may be implemented using a shared pool of configurable computing resources, such as the on-demand, distributed processing power provided by the cloud. Software for implementing the functional units described herein is preferably stored on a non-transient computer readable medium, such as a memory (RAM, cache, hard disk etc) or other storage means (USB stick, CD, disk etc).

The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims

1. A method comprising:

receiving a request for a web page from a client;

identifying that the web page includes a resource that requires the client to be provided with an identifier;

generating the identifier to include data that changes from an identifier generated for that resource responsive to one client request for said web page to the identifier generated for that resource responsive to another client request for said web page; and providing the client with the generated identifier.

2. A method as claimed in claim 1, the method comprising generating the identifier to include an address associated with a source of the web page.

3. A method as claimed in any preceding claim, the method comprising generating the identifier to comprise a data string that is formed by encrypting an address associated with the resource.

4. A method as claimed in claim 3, the method comprising generating the data string by encrypting the address associated with the resource with a key and using a different key to generate an address for providing to the client responsive to one client request than to generate an address for providing to the client responsive to another client request.

5. A method as claimed in claim 4, comprising generating the identifier to include the key used to generate the data string.

6. A method as claimed in any preceding claim, the method comprising generating the identifier to include a data string that is representative of a third party source of the resource.

7. A method as claimed in any preceding claim, the method comprising: receiving a request for the resource from the client, said request including the generated identifier;

recognising from the generated identifier that the request relates to the resource; and providing the client with the resource.

8. A method as claimed in claim 7, comprising recognising that the generated identifier includes an encrypted data string and decrypting said data string to generate the address associated with the resource.

9. A method as claimed in claim 8, comprising recognising that the generated identifier incorporates the key used to encrypt the data string and decrypting the data string using that key.

10. A method as claimed in any preceding claim, the method comprising: recognising from the generated identifier that the resource is provided by a third party source; and forwarding the request to said third party source.

11. A method as claimed in any preceding claim, the method comprising the third party source providing the resource to the source of the web page and the source of the web page providing the resource to the client.

12. A method as claimed in any preceding claim, the method comprising generating the identifier to designate an attribute associated with content comprised in the attribute.

13. A method as claimed in claim 12, wherein the resource comprises content and information defining the attribute, the method comprising assigning the generated identifier to the content and to the information defining the attribute.

14. A method as claimed in any preceding claim, the method comprising determining whether the client is associated with a valid subscription to the web page.

15. A method as claimed in claim 14, the method comprising, if the client is associated with a valid subscription to the web page, either:

not providing the client with an identifier associated with the resource; or providing the client with an address associated with the resource but generating that identifier to be the same as an identifier generated for that resource responsive to another client request for said web page.

16. A method as claimed in any preceding claim, the method comprising providing the generated address as part of the requested web page.

17. A method as claimed in any preceding claim, the method comprising: determining that the client has not requested the resource using the identifier;

and restricting the client’s access to a web site associated with the requested web page responsive to that determination.

18. A web server configured to:

receive a request for a web page from a client;

identify that the web page includes a resource to be provided to the client by way of providing the client with an address;

generate the address to include data that changes from an address generated for that resource responsive to one client request for said web page to the address generated for that resource responsive to another client request for said web page; and provide the client with the generated address.

19. A method comprising:

receiving a request for a web page from a client;

providing the client with the identifier;

determining that the client has not requested the resource using the identifier; and restricting the client’s access to a website associated with the requested web page responsive to that determination.

20. A method as claimed in claim 19, comprising restricting the client’s access to the website by providing the client with a non-requested web page in response to a further request from the client for a web page.

21. A method as claimed in claim 19 or 20, comprising restricting the client’s access to the website until the client performs one or more of the following: preventing a filter at the client from blocking resources from being downloaded or displayed by the client; and accepting a charge for viewing the web site without the resource.

22. A method substantially as herein described with reference to the accompanying drawings.

23. A web server substantially as herein described with reference to the accompanying drawings.