add instructions
Crawls sites, saving all the found links to a SurrealDB database. It then takes batches of 100 uncrawled links until the crawl budget is reached. The data of each site is saved in a MinIO database.

## How to use

1. Clone the repo and `cd` into it.
2. Build the repo with `cargo build -r`.
3. Start the docker containers:
	1. cd into the docker folder: `cd docker`
	2. Bring up the docker containers: `docker compose up -d`
4. From the project's root, edit the `Crawler.toml` file to your liking.
5. Run with `./target/release/internet_mapper`.

You can view stats of the project at `http://<your-ip>:3000/dashboards`

```bash
# Untested script but probably works
git clone https://git.oliveratkinson.net/Oliver/internet_mapper.git
cd internet_mapper

cargo build -r

cd docker
docker compose up -d
cd ..

$EDITOR Crawler.toml

./target/release/internet_mapper
```

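Step 4 above edits `Crawler.toml`; the real keys ship with the repo, but as a rough illustration of the kind of settings involved (every field name below is an assumption, not the project's actual schema), a config might look like:

```toml
# Hypothetical sketch — open the repository's Crawler.toml for the real keys.

# Where the crawl starts and when it stops.
start_url = "https://en.wikipedia.org"
crawl_budget = 1000    # stop after roughly this many pages

# Uncrawled links are fetched in batches (the README mentions batches of 100).
batch_size = 100

# SurrealDB stores the link graph; MinIO stores each site's data.
surreal_url = "localhost:8000"
minio_url = "http://localhost:9000"
```

Whatever the actual field names are, the values must match the services started by `docker compose up -d` in the `docker` folder.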
### TODO

- [x] Domain filtering - prevent the crawler from going on alternate versions of wikipedia.