Recently I was on the hunt for a cross-platform bookmarking solution. My criteria for the tool were:
- Cross-platform -- can bookmark sites from Chrome and Safari on desktop, and Safari on iOS.
- Can bookmark quickly, ideally in 1-2 clicks.
- Can be reasonably confident in data integrity and privacy.
- A way to search bookmarks.
- Ability to append notes or additional text, something more than just a title.
- Nice to have: saves backup copies of the bookmarked pages and allows searching by content.
I tried a variety of services, and most fell short on at least one criterion while providing many features I did not need. Eventually, I saw a Hacker News article referencing YaCy as a self-hosted version of historio.us, whose simplicity I liked, so I decided to give it a try. It met all of my criteria except the ability to append notes. Since it snapshots web pages and provides full-text search, that hasn't proved to be an issue for how I use it.
YaCy Setup on Debian/Ubuntu
Assuming Podman or Docker is installed on the machine where you want to run YaCy, setup is a breeze.
- Install for an Intel CPU with Podman (used on Debian). This downloads the image from docker.io and starts YaCy on port 8090.
podman run -d --name yacy_search_server -p 8090:8090 -p 8443:8443 -v yacy_search_server_data:/opt/yacy_search_server/DATA --restart unless-stopped --log-opt max-size=200m --log-opt max-file=2 docker.io/yacy/yacy_search_server:latest
- Open up YaCy and change the following options so that it acts as a bookmarking service rather than a P2P search engine. If you are asked for credentials, the defaults are username "admin" and password "yacy".
- Use Case & Accounts -> Basic Configuration -> 2. set to 'Search portal for your own web pages'.
- Use Case & Accounts -> Network Configuration -> Ensure it's set to 'Robinson Mode' and select 'Private Peer' and save that section.
- Change the default username and password.
- Set up a bookmarklet that captures the current web page to YaCy when pressed. The crawl depth (crawlingDepth) is set to 0 so that only the intended page is bookmarked and YaCy doesn't follow any URLs on the page. Set the line below as the bookmark URL, replacing localiphere with the IP address of the machine running YaCy.
javascript: (() => { window.open(`http://localiphere:8090/Crawler_p.html?crawlingDomMaxPages=10000&range=wide&intention=&sitemapURL=&crawlingQ=on&crawlingMode=url&crawlingURL=${encodeURIComponent(window.location.href)}&crawlingFile=&mustnotmatch=&crawlingFile%24file=&crawlingstart=Neuen Crawl starten&mustmatch=.*&createBookmark=on&bookmarkFolder=/crawlStart&xsstopw=on&indexMedia=on&crawlingIfOlderUnit=hour&cachePolicy=iffresh&indexText=on&crawlingIfOlderCheck=on&bookmarkTitle=&crawlingDomFilterDepth=1&crawlingDomFilterCheck=on&crawlingIfOlderNumber=1&crawlingDepth=0`, "_blank"); })()
This second bookmark simply opens YaCy showing all results ordered by capture date.
http://localiphere:8090/yacysearch.html?query=*+/date&maximumRecords=10&resource=local&verify=ifexist&prefermaskfilter=&cat=href&constraint=&contentdom=text&strictContentDom=false&meanCount=5&former=a&startRecord=0
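The bookmarklet's query string is long enough that it is hard to see which options are actually set. As a rough sketch (not part of the original setup), the hypothetical helper list_params below prints one parameter per line, which makes it easy to confirm that crawlingDepth=0 and createBookmark=on are present. The address 192.168.1.20 is only a stand-in for your server's IP, and the URL is shortened for readability.

```shell
# Hypothetical helper: print each query parameter of a URL on its own line.
list_params() {
  # Drop everything up to the first "?", then split the rest on "&".
  printf '%s\n' "${1#*\?}" | tr '&' '\n'
}

# Shortened example URL (assumption: 192.168.1.20 stands in for the server's IP).
url='http://192.168.1.20:8090/Crawler_p.html?crawlingMode=url&createBookmark=on&crawlingDepth=0'

# Show only the two options that matter for single-page bookmarking.
list_params "$url" | grep -E '^(crawlingDepth|createBookmark)='
```

Pasting the full bookmarklet URL in place of the shortened one works the same way.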
- If you're running YaCy on a headless server and it doesn't seem accessible from the outside, or the podman container seems to stop itself, enable lingering for your user so the container keeps running after you log out.
sudo loginctl enable-linger "$USER"
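To check whether lingering took effect and whether the web interface answers, something like the following may help. This is a sketch under assumptions: linger_is_on and yacy_answers are hypothetical helper names, and the host and port defaults simply mirror the port mapping used earlier.

```shell
# Hypothetical helper: succeed if loginctl output on stdin reports lingering as enabled.
linger_is_on() {
  grep -qx 'Linger=yes'
}

# Hypothetical helper: succeed if anything answers HTTP on the YaCy port.
# Arguments: host (default localhost) and port (default 8090).
# Any HTTP response counts, including a login prompt.
yacy_answers() {
  curl --silent --output /dev/null --max-time 5 "http://${1:-localhost}:${2:-8090}/"
}

# Example usage (run on the server):
#   loginctl show-user "$USER" --property=Linger | linger_is_on && echo "linger enabled"
#   yacy_answers && echo "YaCy is reachable"
```

If the second check succeeds on the server but fails from another machine, the cause is more likely a firewall or network issue than the container itself.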
Backing up and restoring data
Once you have YaCy up and running, at some point you may want to back up or transfer your bookmarks. Here's how to do that.
Backup Data
These commands back up the YaCy data into the /home/chris/yacy-backups directory (make sure the directory already exists).
podman stop yacy_search_server
podman run --rm -v yacy_search_server_data:/opt/yacy_search_server/DATA -v /home/chris/yacy-backups:/tmp:z docker.io/openjdk:8-stretch bash -c "cd /opt/yacy_search_server && tar -cf - DATA | xz -q -3v -T0 > /tmp/YACYDATA-$(date +\"%Y-%m-%d\").tar.xz"
podman start yacy_search_server
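The three backup steps can be wrapped in a function so that the server is restarted even when the archive step fails. This is a minimal sketch, not a tested backup tool: yacy_backup, backup_filename, and the BACKUP_DIR variable are names introduced here for illustration, and the date is computed on the host to avoid nested quoting.

```shell
# Assumption: same backup directory as in the commands above; override via BACKUP_DIR.
BACKUP_DIR="${BACKUP_DIR:-/home/chris/yacy-backups}"

# Hypothetical helper: dated archive name, e.g. YACYDATA-<year>-<month>-<day>.tar.xz.
backup_filename() {
  printf 'YACYDATA-%s.tar.xz\n' "$(date +%Y-%m-%d)"
}

# Hypothetical wrapper around the stop / archive / start sequence.
yacy_backup() {
  name="$(backup_filename)"
  podman stop yacy_search_server || return 1
  podman run --rm \
    -v yacy_search_server_data:/opt/yacy_search_server/DATA \
    -v "$BACKUP_DIR":/tmp:z \
    docker.io/openjdk:8-stretch \
    bash -c "cd /opt/yacy_search_server && tar -cf - DATA | xz -q -3v -T0 > /tmp/$name"
  status=$?
  # Restart the server whether or not the archive container succeeded.
  podman start yacy_search_server
  return "$status"
}
```

Calling yacy_backup from a cron job or systemd timer would be one way to run it regularly.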
Restore data
These are the commands to restore data into YaCy from a backup. I believe this also restores settings in addition to bookmarks.
This assumes the backup file is named YACYDATA.tar.xz and is present in the target directory (in this example, /home/chris/yacy-backups/). The backup command above writes a dated name (YACYDATA- followed by the date), so rename or copy the archive first, or adjust the filename in the restore command.
podman stop yacy_search_server
podman run --rm -v yacy_search_server_data:/opt/yacy_search_server/DATA -v /home/chris/yacy-backups:/tmp:z docker.io/openjdk:8-stretch bash -c "cd /opt/yacy_search_server && rm -rf DATA/* && tar xf /tmp/YACYDATA.tar.xz"
podman start yacy_search_server
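Because the backups carry a date in their names, picking the newest one can be scripted. The sketch below assumes the YACYDATA-YYYY-MM-DD.tar.xz naming produced by the backup command; latest_backup is a hypothetical helper name.

```shell
# Hypothetical helper: print the newest dated backup in the given directory.
# ISO dates sort lexicographically, so the last glob match is the newest archive.
latest_backup() {
  newest=""
  for f in "$1"/YACYDATA-*.tar.xz; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] && newest="$f"
  done
  # Fails (returns non-zero) when no backup was found.
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Example usage: copy the newest archive to the name the restore command expects.
#   cp "$(latest_backup /home/chris/yacy-backups)" /home/chris/yacy-backups/YACYDATA.tar.xz
```

Copying rather than renaming keeps the dated original in place in case the restore needs to be repeated.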