custom_engine #1

Merged
Oliver merged 16 commits from custom_engine into main 2024-11-13 00:53:31 +00:00

16 Commits

Author SHA1 Message Date
d9d4c56142 fits with compose.yml 2024-11-12 17:51:13 -07:00
f87a43c3a9 added demo site for testing elements 2024-11-12 17:50:59 -07:00
7cac880f8e remove unused function 2024-11-12 17:50:28 -07:00
720adaa552 added support for nearly all html tags that can have a link 2024-11-12 17:50:06 -07:00
7c32600694 this is really all that's needed 2024-11-12 17:49:45 -07:00
399510c599 use reqwest client for epic speedup 2024-11-10 20:37:00 -07:00
ec66c4e765 remove unused import 2024-11-10 20:36:39 -07:00
a9628ee5e4 working, now onto speeding it up 2024-11-10 20:24:04 -07:00
5404d5c3e8 it works :party: 2024-11-09 23:30:57 -07:00
fd971bafbf it works now 2024-11-09 15:28:10 -07:00
c3997b0bb7 works more, but still not all the way 2024-11-09 11:30:32 -07:00
Oliver Atkinson
7826c4cec6 jank-ish fix but it sure does work
make the root record (for links such as https://example.com/) use the URL as its record id, preventing duplication when using upsert
2024-10-31 15:32:37 -06:00
Oliver Atkinson
3a46dd937b updates 2024-10-31 15:09:48 -06:00
Oliver Atkinson
fbca067b1f clean up walk() 2024-10-31 14:10:14 -06:00
Oliver Atkinson
9324160e74 crawling 🕷️ 2024-10-07 11:14:56 -06:00
Oliver Atkinson
974bccc457 no longer using spider, just writing my own crawler 2024-10-04 13:52:34 -06:00
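
Commit 7826c4cec6 describes using the URL as the record id so that repeated upserts of the same page do not create duplicates. The sketch below illustrates that idea only. It is not the code from this PR: `Store`, `PageRecord`, and `store_page` are hypothetical names, and a `HashMap` stands in for whatever database the crawler actually writes to.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a stored page record (not the PR's real schema).
#[derive(Debug, Clone, PartialEq)]
struct PageRecord {
    url: String,
    crawled: bool,
}

/// Hypothetical in-memory store keyed by record id.
/// Assumption: the real backend offers an upsert with the same semantics.
#[derive(Default)]
struct Store {
    records: HashMap<String, PageRecord>,
}

impl Store {
    /// Insert the record if the id is new, otherwise overwrite the existing one.
    fn upsert(&mut self, id: &str, record: PageRecord) {
        self.records.insert(id.to_string(), record);
    }
}

/// Store a page using its URL as the record id, so visiting the same URL
/// twice updates one record instead of creating a second one.
fn store_page(store: &mut Store, url: &str, crawled: bool) {
    store.upsert(
        url,
        PageRecord {
            url: url.to_string(),
            crawled,
        },
    );
}
```

With this shape, calling `store_page` twice for https://example.com/ leaves a single record whose fields reflect the latest call. A generated id would instead produce one record per call, which is the duplication the commit message mentions.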