If the term paywall means nothing to you, read this article. The continued rise of ad-blocking apps has undermined the revenue model based on displaying ads on web pages.
As a result, many news sites have turned to other ways of making money. Large websites such as The Wall Street Journal, Financial Times, The New York Times, and The Washington Post use paywalls to increase their income.
There are different types of paywalls but they all have one thing in common: they block access to the content of the web pages, either immediately or after reading a certain number of articles.
Immediately afterwards, a paywall window appears, asking visitors to subscribe to the site to continue reading.
This practice may make sense from a business perspective, and may be much more lucrative than Google ads that adblockers block, but it has a serious disadvantage.
Websites that use a paywall lose a high percentage of their visitors.
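To make the two paywall types concrete, here is a minimal sketch of the decision a site might make on each page view. The function names and the limit of three free articles are assumptions made for illustration, not the logic of any particular site.

```javascript
// Hypothetical limit: number of free articles per visitor (an assumption).
const FREE_ARTICLE_LIMIT = 3;

// Decide whether to show the paywall for one page view.
// `articlesRead` is how many articles this visitor has already read;
// a site would typically remember this number in a cookie.
function meterPageView(articlesRead, limit = FREE_ARTICLE_LIMIT) {
  if (articlesRead >= limit) {
    // Limit reached: block the content and keep the count unchanged.
    return { showPaywall: true, articlesRead: articlesRead };
  }
  // Still under the limit: serve the article and count it.
  return { showPaywall: false, articlesRead: articlesRead + 1 };
}

// A paywall that blocks immediately is the same check with a limit of zero.
function hardPaywall(articlesRead) {
  return meterPageView(articlesRead, 0);
}
```

Because the count lives with the visitor, a request that arrives without the cookie looks like a first visit, which is why cookies matter for what follows.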
It's no secret that news sites give search engines access to all of their content. If you look at Google News, for example, you will notice articles from websites that use paywalls.
Are you wondering how sites can block or allow access to content?
There are many different checks, including the user agent, the referrer (the page you came from), and of course cookies. All of these can be used to decide whether a request should be given access.
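The checks described above can be sketched as a single server-side function. This is only an illustration under stated assumptions: `decideAccess` and the `subscriber=1` cookie are hypothetical, and real sites apply their own, more involved rules.

```javascript
// Hypothetical server-side check (an assumption for illustration).
// `headers` is a plain object mapping request header names to values.
function decideAccess(headers) {
  // HTTP header names are case-insensitive, so normalize them first.
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = value;
  }
  const userAgent = h["user-agent"] || "";
  const referer = h["referer"] || "";
  const cookie = h["cookie"] || "";

  // 1. User agent: the request claims to be a search engine crawler.
  if (userAgent.includes("Googlebot")) return "full";
  // 2. Referrer: the visitor arrived from a search results page.
  if (referer.startsWith("https://www.google.com/")) return "full";
  // 3. Cookies: a hypothetical cookie marking a logged-in subscriber.
  if (/(^|;\s*)subscriber=1(;|$)/.test(cookie)) return "full";

  return "paywall";
}

console.log(decideAccess({ "User-Agent": "Mozilla/5.0 (Windows NT 10.0)" }));
console.log(decideAccess({ Referer: "https://www.google.com/" }));
```

A check like this trusts whatever the client sends, which is the weakness the rest of the article relies on.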
How can we overcome the obstacles?
Perhaps the best way is to disguise yourself as Googlebot by sending the following two request headers. Let's see how it works:
Referer: https://www.google.com/
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
In Firefox
Firefox users need two add-ons to bypass paywalls: the first is RefControl, which changes the referrer sent when you visit news websites, and the second is User Agent Switcher, which changes the browser's user agent.
Download and install both extensions in Firefox.
Press the Alt key and select Tools > Options > RefControl.
Click "add webpage", type in a domain, select the custom action, and enter https://www.google.com/ as the referrer.
Repeat for each news site you want to access.
When finished, close the configuration window.
Press the Alt key again and select Tools > Default User Agent > Edit User Agents from the menu. Add a new user agent that uses the Googlebot string shown above, and name it Googlebot.
Exit the menu.
Before you try to access these sites, press Alt and select Tools > Default User Agent > Googlebot.
Note: the above add-ons will not work on all sites.
For Google Chrome
Google Chrome users can also install two extensions: User Agent Switcher and Referer Control.
There is, however, another option: creating a custom extension that automates the process in the browser.
All you need to do is create a new directory on your computer and put two files in it, background.js and manifest.json, as described below. Note that the manifest uses manifest version 2 and the blocking webRequest API, which recent versions of Chrome are phasing out, so the extension may not load in an up-to-date browser.
background.js
// Sites whose cookies are kept; for every other site the Cookie header is dropped.
var ALLOW_COOKIES = ["nytimes", "ft.com"];
var GOOGLE_REFERER = "https://www.google.com/";
var GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

// Rewrite the outgoing request headers so the request looks like Googlebot.
function changeRefer(details) {
  var foundReferer = false;
  var foundUA = false;

  var reqHeaders = details.requestHeaders.filter(function (header) {
    // Block cookies by default (header names are case-insensitive).
    if (header.name.toLowerCase() !== "cookie") {
      return true;
    }
    return ALLOW_COOKIES.some(function (site) {
      return details.url.includes(site);
    });
  }).map(function (header) {
    var name = header.name.toLowerCase();
    if (name === "referer") {
      header.value = GOOGLE_REFERER;
      foundReferer = true;
    }
    if (name === "user-agent") {
      header.value = GOOGLEBOT_UA;
      foundUA = true;
    }
    return header;
  });

  // Append the headers if the request did not already carry them.
  if (!foundReferer) {
    reqHeaders.push({ name: "Referer", value: GOOGLE_REFERER });
  }
  if (!foundUA) {
    reqHeaders.push({ name: "User-Agent", value: GOOGLEBOT_UA });
  }
  console.log(reqHeaders);
  return { requestHeaders: reqHeaders };
}

// Remove every Set-Cookie header from the response.
// filter() avoids skipping entries, which splice() inside a forward loop would do.
function blockCookies(details) {
  var resHeaders = details.responseHeaders.filter(function (header) {
    return header.name.toLowerCase() !== "set-cookie";
  });
  return { responseHeaders: resHeaders };
}

chrome.webRequest.onBeforeSendHeaders.addListener(changeRefer, {
  urls: ["<all_urls>"],
  types: ["main_frame"]
}, ["requestHeaders", "blocking"]);

chrome.webRequest.onHeadersReceived.addListener(blockCookies, {
  urls: ["<all_urls>"],
  types: ["main_frame"]
}, ["responseHeaders", "blocking"]);
manifest.json
{
  "name": "Innocuous Chrome Extension",
  "version": "0.1",
  "description": "This is an innocuous chrome extension.",
  "permissions": [
    "webRequest",
    "webRequestBlocking",
    "http://www.ft.com/*",
    "http://www.wsj.com/*",
    "https://www.wsj.com/*",
    "http://www.economist.com/*",
    "http://www.nytimes.com/*",
    "https://hbr.org/*",
    "http://www.newyorker.com/*",
    "http://www.forbes.com/*",
    "http://online.barrons.com/*",
    "http://www.barrons.com/*",
    "http://www.investingdaily.com/*",
    "http://realmoney.thestreet.com/*",
    "http://www.washingtonpost.com/*"
  ],
  "background": {
    "scripts": ["background.js"]
  },
  "manifest_version": 2
}
Create the files in a text editor such as Notepad++, save them as background.js and manifest.json (not as .txt), and place both in the same folder.
If you do not have the time, you can download the folder with the files from the following link:
Open the address
chrome://extensions/
and enable "developer mode". Then click "load unpacked extension" and select the folder with the two files you created.
The extension is now loaded in Chrome.
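You can modify the files to add new sites you want to bypass. Just open the files with Notepad++ and read the code: sites are listed under "permissions" in manifest.json, and sites that should keep their cookies are listed in ALLOW_COOKIES in background.js.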
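When adding sites, each one needs a match pattern in the "permissions" list of manifest.json. The following sketch, meant to be run with Node.js rather than inside the extension, builds those patterns from host names. The helper names `hostPermissions` and `addHosts` and the host `news.example.com` are placeholders invented for this example.

```javascript
// Build Chrome match patterns (http and https) for a list of host names.
function hostPermissions(hosts) {
  const patterns = [];
  for (const host of hosts) {
    patterns.push(`http://${host}/*`, `https://${host}/*`);
  }
  return patterns;
}

// Return a copy of a manifest object with patterns for the extra hosts
// appended, skipping any pattern that is already listed.
function addHosts(manifest, hosts) {
  const existing = new Set(manifest.permissions);
  const added = hostPermissions(hosts).filter((p) => !existing.has(p));
  return { ...manifest, permissions: [...manifest.permissions, ...added] };
}

// A shortened manifest, used only to demonstrate the helper.
const manifest = {
  name: "Innocuous Chrome Extension",
  permissions: ["webRequest", "webRequestBlocking", "https://www.wsj.com/*"],
};

// "news.example.com" is a placeholder host, not a real site.
const updated = addHosts(manifest, ["news.example.com", "www.wsj.com"]);
console.log(JSON.stringify(updated.permissions, null, 2));
```

After editing manifest.json, reload the extension from the extensions page so that Chrome picks up the new permissions.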