diff --git a/docs/scraping.md b/docs/scraping.md
new file mode 100644
index 0000000..0a5320f
--- /dev/null
+++ b/docs/scraping.md
@@ -0,0 +1,81 @@
+<https://en.wikipedia.org/wiki/List_of_sovereign_states>
+
+## get links of countries from table of sovereign states
+
+XPath to select data from the table:
+
+Country:
+
+```
+//table[contains(@class, 'sortable') and contains(@class, 'wikitable')]/tbody/tr[not(contains(@style, 'background'))]/td[1][contains(@style, 'vertical-align:top;')]/b/a
+```
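+
+A minimal sketch for checking the cell predicate offline, assuming `parsel` (the selector library Scrapy builds on) is installed. The HTML is a hypothetical, heavily simplified stand-in for the real table, and the `ROWS` and `STYLE` names are only for illustration. It compares `td[1 and ...]`, where `1` is treated as a boolean and every styled cell matches, with the chained form `td[1][...]`, which keeps only the first cell of each row:
+
+```python
+from parsel import Selector
+
+# Hypothetical fixture, not the actual Wikipedia markup.
+HTML = """
+<table class="wikitable sortable">
+  <tbody>
+    <tr style="background:#ccc;">
+      <td style="vertical-align:top;"><b><a href="/wiki/Skipped_row">skipped</a></b></td>
+    </tr>
+    <tr>
+      <td style="vertical-align:top;"><b><a href="/wiki/Abkhazia">Abkhazia</a></b></td>
+      <td style="vertical-align:top;"><b><a href="/wiki/Second_cell">second cell</a></b></td>
+    </tr>
+  </tbody>
+</table>
+"""
+
+# Rows of the sortable wikitable that have no background style.
+ROWS = (
+    "//table[contains(@class, 'sortable') and contains(@class, 'wikitable')]"
+    "/tbody/tr[not(contains(@style, 'background'))]"
+)
+STYLE = "contains(@style, 'vertical-align:top;')"
+
+sel = Selector(text=HTML)
+
+# "1 and ..." is a boolean expression, so the predicate is not positional.
+boolean_form = sel.xpath(f"{ROWS}/td[1 and {STYLE}]/b/a/@href").getall()
+
+# Chained predicates: first cell of the row, kept only if it has the style.
+positional_form = sel.xpath(f"{ROWS}/td[1][{STYLE}]/b/a/@href").getall()
+
+print(boolean_form)     # ['/wiki/Abkhazia', '/wiki/Second_cell']
+print(positional_form)  # ['/wiki/Abkhazia']
+```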
+
+## scrapy
+
+Pagination guide:
+<https://thepythonscrapyplaybook.com/scrapy-pagination-guide/>
+
+Response:
+<https://docs.scrapy.org/en/latest/topics/request-response.html#scrapy.http.Response>
+
+Using selectors:
+<https://docs.scrapy.org/en/latest/topics/selectors.html?highlight=xpath#using-selectors>
+
+Download files/images:
+<https://docs.scrapy.org/en/latest/topics/media-pipeline.html>
+
+### new project
+
+```
+scrapy startproject wikipedia_country_scraper
+```
+
+### create spider
+
+```
+scrapy genspider countrydownloader https://en.wikipedia.org/wiki/List_of_sovereign_states
+```
+
+### using scrapy shell
+
+- Install `ipython`:
+```
+poetry add ipython
+```
+
+- Add to `scrapy.cfg` under `[settings]`:
+```
+shell = ipython
+```
+
+- Run scrapy shell:
+```
+scrapy shell
+```
+
+- Fetch a URL:
+```
+fetch("https://en.wikipedia.org/wiki/List_of_sovereign_states")
+```
+
+- Print the response:
+```
+response
+```
+
+- Select the country links using XPath (each result is a `Selector`):
+```
+countries = response.xpath("//table[contains(@class, 'sortable') and contains(@class, 'wikitable')]/tbody/tr[not(contains(@style, 'background'))]/td[1][contains(@style, 'vertical-align:top;')]/b/a/@href")
+countries[0]
+```
+
+- Extract the matched value as a string:
+```
+countries[0].get()
+```
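+
+The `@href` values on Wikipedia are site-relative paths such as `/wiki/Abkhazia`, so they need to be joined with the page URL before they can be requested. A minimal sketch using only the standard library; `absolute_links` and `BASE_URL` are hypothetical names introduced for illustration:
+
+```python
+from urllib.parse import urljoin
+
+# URL of the page the links were extracted from.
+BASE_URL = "https://en.wikipedia.org/wiki/List_of_sovereign_states"
+
+
+def absolute_links(hrefs, base_url=BASE_URL):
+    """Resolve relative hrefs against the URL of the page they came from."""
+    return [urljoin(base_url, href) for href in hrefs]
+
+
+print(absolute_links(["/wiki/Abkhazia", "/wiki/Afghanistan"]))
+# ['https://en.wikipedia.org/wiki/Abkhazia', 'https://en.wikipedia.org/wiki/Afghanistan']
+```
+
+Inside the Scrapy shell, `response.urljoin(href)` should give the same result without the extra import.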
+
+### Spider
+
+[start_requests](https://docs.scrapy.org/en/latest/topics/spiders.html#scrapy.Spider.start_requests) generates a [Request](https://docs.scrapy.org/en/latest/_modules/scrapy/http/request.html#Request) for each URL in `start_urls`.
+
+By default, if a request does not specify a callback, the spider's `parse()` method is called to handle its response.
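+
+A minimal sketch of how the two callback paths might look in the `countrydownloader` spider, not a finished implementation. The `parse_country` method, the item fields, and the `COUNTRY_LINKS` constant are assumptions made for illustration, and the XPath is written in the positional `td[1][...]` form:
+
+```python
+import scrapy
+
+# XPath for the country links, using the positional td[1][...] form.
+COUNTRY_LINKS = (
+    "//table[contains(@class, 'sortable') and contains(@class, 'wikitable')]"
+    "/tbody/tr[not(contains(@style, 'background'))]"
+    "/td[1][contains(@style, 'vertical-align:top;')]/b/a/@href"
+)
+
+
+class CountryDownloaderSpider(scrapy.Spider):
+    name = "countrydownloader"
+    allowed_domains = ["en.wikipedia.org"]
+    start_urls = ["https://en.wikipedia.org/wiki/List_of_sovereign_states"]
+
+    def parse(self, response):
+        # Default callback: receives the responses for start_urls,
+        # because those requests are created without a callback.
+        for href in response.xpath(COUNTRY_LINKS).getall():
+            # Explicit callback: country pages are routed to parse_country.
+            yield response.follow(href, callback=self.parse_country)
+
+    def parse_country(self, response):
+        # Hypothetical item shape; these notes do not define one.
+        yield {
+            "url": response.url,
+            "title": response.css("title::text").get(),
+        }
+```
+
+With the class saved in the project's `spiders` package, it can be started with `scrapy crawl countrydownloader`.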