Commit 33b1cccb authored by numeroteca

add elections 2019 example to Readme

parent e81688c6
@@ -6,7 +6,7 @@ R script to count how many titles per hour that have certain words in a bunch of
![Porcentaje de noticias en portada sobre el escándalo de Cifuentes](http://numeroteca.org/wp-content/uploads/2018/06/porcentaje-noticias-portada-diarios-digitales-cifuentes-b.jpg "Porcentaje de noticias en portada sobre el escándalo de Cifuentes")
View example: [El escándalo del TFM de Cifuentes en las páginas de inicio](http://numeroteca.org/2018/06/08/escandalo-tfm-cifuentes-paginas-inicio-periodicos-digitales/) (6/2018. numeroteca.org).
View example: [El escándalo del TFM de Cifuentes en las páginas de inicio](http://numeroteca.org/2018/06/08/escandalo-tfm-cifuentes-paginas-inicio-periodicos-digitales/) (6/2018, numeroteca.org) or [la cobertura de los partidos en periodo electoral](http://numeroteca.org/2019/06/24/cobertura-de-partidos-en-paginas-de-inicio-en-elecciones-generales-28a/) (6/2019, numeroteca.org).
# How to use Homepagex
@@ -16,15 +16,15 @@ Ask @numeroteca / http://numeroteca.org.
## Create the list of newspaper home pages
Run this where you have your downloaded files stored.
Run this where you have your downloaded files stored in the command line (bash).
`for f in *.gz; do echo "$f" >> mylist.txt; done`
The `mylist.txt` has all the names of the .gz files that contain the html of the home pages.
The `mylist.txt` created file has all the names of the .gz files that contain the html of the home pages.
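The loop above appends to `mylist.txt`, so running it twice lists every file twice. A minimal sketch of a variant that overwrites the list on each run, assuming a POSIX-compatible shell; the function name `make_list` is hypothetical and not part of the repository:

```shell
# make_list is an illustrative name, not something defined in this repository.
# It writes one .gz file name per line and replaces any previous mylist.txt.
make_list() {
  for f in *.gz; do
    # When nothing matches, the shell leaves the literal pattern "*.gz";
    # the -e test skips it so the list stays empty instead of wrong.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done > mylist.txt
}

make_list
```

Redirecting the whole loop with `>` rather than appending with `>>` inside it is what makes repeated runs safe.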
## Create data frame with all the newspaper names time and date
Based on `mylist.txt` and using `html-parser.R` creates a `results.Rda` file with:
Based on `mylist.txt` and using `html-parser.R` create a `results.Rda` file with:
+ number of titles in home page
+ number of selected titles in home page that have certain selected words
@@ -40,7 +40,7 @@ A series of visualizations to view the results obtained.
## Where are home pages html coming from?
We are using Storytracker (http://storytracker.pastpages.org/en/latest/) to store a list of newspaper home page every hour.
We are using Storytracker script (http://storytracker.pastpages.org/en/latest/) to store in our own server a list of newspaper home page every hour. We only save the html of the page.
## Which newspapers are you storing?
......