How Google Caches Your Website Data! Exposed to Everyone on the Internet
First, you need to know what web caching is and how it works.
What is Web caching?
A web cache (or HTTP cache) is an information technology for the temporary storage (caching) of web documents, such as HTML pages and images, to reduce server lag. A web cache system stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met. A web cache system can refer either to an appliance or to a computer program.
Web caching lets us access a website faster because copies of pages and files that have already been fetched are stored closer to the user, keyed by their URL. This is different from the DNS lookup table (DNS cache), which only stores the mapping from domain names to IP addresses.
The cache is built from the site access requests made by users on a particular network.
Behind the scenes, an algorithm decides which web pages to cache and which not to.
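To give a feel for what such a decision can look like, here is a deliberately simplified sketch in Python. It assumes the cache only looks at the request method, the response status and the Cache-Control response header. Real caches apply many more rules, and the function name is_cacheable_by_shared_cache is made up for this example.

```python
def is_cacheable_by_shared_cache(method: str, status: int, cache_control: str) -> bool:
    """Simplified decision: may a shared (network) cache store this response?"""
    # Split "public, max-age=3600" into {"public", "max-age"}.
    directives = {
        part.strip().lower().split("=")[0]
        for part in cache_control.split(",")
        if part.strip()
    }
    # Assumption for this sketch: only plain GET requests are cached.
    if method.upper() != "GET":
        return False
    # Assumption for this sketch: only successful 200 responses are cached.
    if status != 200:
        return False
    # "no-store" forbids storing at all; "private" forbids shared caches.
    if "no-store" in directives or "private" in directives:
        return False
    return True


print(is_cacheable_by_shared_cache("GET", 200, "public, max-age=3600"))
print(is_cacheable_by_shared_cache("GET", 200, "private"))
```

A page served with "private" or "no-store" is skipped by a cache that follows these rules, while a publicly cacheable page is stored.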
Web caching cycle
A user requests a web page by clicking its URL. The request passes through the network, which analyses it against certain parameters (for example, whether a copy of that URL is already cached and still considered valid). If those parameters are met, the request is redirected to a local network cache.
It may also happen that the cache has no entry for the requested web page. In that case, the network cache (the web cache) makes its own request to the original web server.
Upon receiving the request, the original web server delivers the required content to the cache. The network cache then delivers the content to the client.
It also saves a copy of the content in its local storage. That content is now cached.
Once the web page is cached, if any other user requests the same web page, the network looks up the request and redirects it to the local web cache.
Since the request is not sent over the internet, network caching makes content delivery faster for the associated network. Also, depending on how the web server is configured, the web cache keeps its cached resources up to date over time.
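The cycle described above can be sketched in a few lines of Python. This is only a toy model: the WebCache class and the fake_origin function are invented for illustration, the "origin server" is just a local function, and cache expiry is left out entirely.

```python
from typing import Callable, Dict


class WebCache:
    """Toy network cache sitting between clients and an origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, str] = {}  # URL -> cached content
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        # Cache hit: answer from local storage, no request to the origin.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Cache miss: ask the origin server, then keep a copy for next time.
        self.misses += 1
        content = self._fetch_from_origin(url)
        self._store[url] = content
        return content


def fake_origin(url: str) -> str:
    """Stand-in for the original web server (hypothetical)."""
    return f"<html>content of {url}</html>"


cache = WebCache(fake_origin)
first = cache.get("http://example.com/index.html")   # miss, goes to origin
second = cache.get("http://example.com/index.html")  # hit, served locally
print(cache.hits, cache.misses)
```

The first request is a miss and goes to the origin; the second request for the same URL is answered from the cache.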
Now what?
How do Google indexing and caching work?
Through Google search you can find almost anything that is publicly available on the internet. Google indexes these resources and, as a result of indexing, caches them.
The main point is that any sensitive data Google has already cached is exposed to everyone on the internet, and this is exactly what attackers are looking for. An attacker can use Google search to find out whether sensitive data from your website has been cached. In other words, Google caching can disclose any data it has picked up, whether it is sensitive or not.
Now you understand how Google indexing and caching work.
How do you find out whether Google has indexed your website's sensitive data?
Type a query like this into Google search:
site:example.com
This asks Google to return all the indexed and cached resources related to that domain.
Then go through the list of returned search results and look for any sensitive data that has been cached.
To make the search more precise, you can look for cached resources whose URL contains the word token:
site:example.com inurl:token
This checks whether sensitive data from your web application, in this case URLs containing a token, has been cached by Google.
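If you have several domains or keywords to check, you can generate these queries instead of typing them by hand. Here is a small Python sketch. The helper names build_site_query and build_search_url are made up for this example, and the search URL format is an assumption about the public Google search page. Only run such checks against domains you own.

```python
from urllib.parse import urlencode


def build_site_query(domain: str, inurl: str = "") -> str:
    """Build a query such as 'site:example.com inurl:token'."""
    query = f"site:{domain}"
    if inurl:
        query += f" inurl:{inurl}"
    return query


def build_search_url(query: str) -> str:
    """Turn the query into a URL you can open in a browser (assumed format)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Keywords that often show up in sensitive URLs (illustrative list only).
for keyword in ["token", "password", "session"]:
    query = build_site_query("example.com", keyword)
    print(query, "->", build_search_url(query))
```

Open each generated URL in your browser and review the results manually, as described above.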
Countermeasure/Fixing
What should you do to prevent Google caching from happening?
Put the following meta tag in the HTML code of every sensitive page that you don't want to be indexed and cached:
<meta name="robots" content="noindex">
This instructs Google not to index the resource.
Once Google has recrawled the page and seen this tag, information disclosure through Google caching should no longer be possible for that page.
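It is easy to forget the tag on one of your sensitive pages, so it can help to check for it automatically. The Python sketch below assumes the tag has the common form <meta name="robots" content="noindex"> (or name="googlebot") and uses only the standard library. The names RobotsMetaParser and has_noindex are invented for this example.

```python
from html.parser import HTMLParser
from typing import Set


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in robots/googlebot meta tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


protected = '<html><head><meta name="robots" content="noindex"></head></html>'
unprotected = "<html><head><title>Account</title></head></html>"
print(has_noindex(protected), has_noindex(unprotected))
```

You could feed it the HTML of each sensitive page of your application and flag every page for which has_noindex returns False.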
Hope you like it! Please share and comment.