How do you manage your (financial) documents?

Mobius · September 26, 2021, 8:57pm

FWIW, I favor a self-hosted approach. I scan paper documents as they come in with the CamScanner Pro Android app and share the resulting file to the Syncthing app, which uploads it to a specific folder on my NAS. The NAS then runs OCRmyPDF (a Python tool) to OCR the PDF automatically and moves it to a year/month-based directory. I can then use the Ambar software to do full-text search on any document I have on my NAS. The quick scanning/OCR procedure makes scanning documents a no-brainer, and full-text search makes it very easy to find documents when I need them (in a matter of seconds). The software stack on the NAS is configured using docker-compose.
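
The OCR-and-file step described above can be sketched roughly as follows. This is not the actual setup from the post, only a minimal illustration: the function names, the use of the file's modification time as the scan date, and the default language `eng` are all assumptions. It calls the `ocrmypdf` command-line tool, which must be installed separately.

```python
import subprocess
from datetime import date
from pathlib import Path


def dated_destination(archive_root: Path, pdf_name: str, when: date) -> Path:
    """Return archive_root/YYYY/MM/pdf_name for the given date."""
    return archive_root / f"{when.year:04d}" / f"{when.month:02d}" / pdf_name


def build_ocr_command(src: Path, dst: Path, language: str = "eng") -> list[str]:
    """Build the ocrmypdf call: -l selects the Tesseract language."""
    return ["ocrmypdf", "-l", language, str(src), str(dst)]


def process_inbox(inbox: Path, archive_root: Path, language: str = "eng") -> None:
    """OCR every PDF in the inbox folder into a year/month tree."""
    for pdf in sorted(inbox.glob("*.pdf")):
        # Assumption: the file's modification time stands in for the scan date.
        scanned_on = date.fromtimestamp(pdf.stat().st_mtime)
        dst = dated_destination(archive_root, pdf.name, scanned_on)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # check=True raises if ocrmypdf fails, so the original is kept.
        subprocess.run(build_ocr_command(pdf, dst, language), check=True)
        pdf.unlink()  # remove the un-OCRed original only after success
```

Calling `process_inbox(Path("/path/to/inbox"), Path("/path/to/archive"))` from a scheduled job would process whatever Syncthing has delivered since the last run; both paths are placeholders.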

Ed_Waadt · September 27, 2021, 7:52am

Thank you!

Now I have a new project thanks to you

SwissDan · September 27, 2021, 12:37pm

If you’re into self-hosting solutions, you might be interested in https://github.com/jonaswinkler/paperless-ng

OCR included. I don’t use it, but I’ve heard a lot of good things about it.

Mobius · September 27, 2021, 9:29pm

That was one of the many solutions I tried, but I never managed to get it to work properly. These were my requirements, by the way:

Self-hosted;
Quick to import and to search PDFs;
Use Tesseract OCR, as that is the gold standard of open-source OCR for me;
No state stored that cannot be rebuilt from the PDFs themselves (I don’t want to own a separate datastore). Some solutions import PDFs and then you lose sight of them and everything must be done within the tool itself. No thanks.

Polo · October 3, 2021, 3:39pm

I scan all documents and destroy the paper originals, download the eBill invoices, etc.
I sort them into different folders (assurances (insurance), prévoyance (pension provision), ménage (household), impôts (taxes), etc.) on a cloud drive (I do not use the laptop’s internal drive, so that they stay available from every device when I’m not at home).
Every time I do a bit of admin work (about once or twice a month), I duplicate the folders to another cloud and to an external hard drive.

Mobius · October 13, 2021, 9:46pm

Coincidentally, my Ambar server recently started having issues with expired SSL certificates and consuming some CPU every 5 minutes, so I started looking for a replacement. I have now replaced it with paperless-ng (a much better maintained fork of the original paperless project) and I’m much happier with it, for the following reasons:

it uses far fewer resources when idle;
it still leverages the great OCRmyPDF/Tesseract projects;
it has AI technology to automatically recognize sender/document date/type of document;
it still maintains a tree of PDFs on your disk so you don’t lose access to them if the project stops being maintained for some reason;
there’s a ton of community support, with apps for Android/iOS and even a command-line interface(!)

In short, hopefully you haven’t yet gone down the Ambar route, since this will be a more future-proof solution. Hope this helps!

SwissTeslaBull · October 14, 2021, 6:14am

Simple: I have a “Finance” share on my Synology NAS with nightly encrypted backup to the Synology Cloud.

Ed_Waadt · October 16, 2021, 2:36pm

Thanks for the heads-up! Fortunately no, I was on holiday and I just got back to my PC to geek out, so I’ll give this NG fork a test.

belouga13 · October 19, 2021, 7:16am

Hi @betterlatethannever
Very good topic! I handle archiving for both work and my personal life (obviously). I apply the 10-year rule to my private stuff as well (if it is in paper form I file it in a binder; if not, I archive everything on a NAS with backup).
However, I recently came across this startup, based in Lausanne: Addmin - your intelligent digital filing cabinet
I haven’t tried it out (yet), but it might be of interest to anyone for whom managing/filing documents is not their cup of tea.

San_Francisco · October 22, 2021, 8:15pm

Are these hosted/server solutions worth it for an individual user?

Can’t you just use your file manager’s built-in folders, tags and search functionality (Finder tags, Spotlight search)? Much less complexity and installation required, less likely to break with a software update or malfunction…?

Burningstone · October 27, 2021, 2:45pm

Thank you for sharing this. Just gave paperless-ng a try and I have to say I’m really impressed by the OCR capabilities.

I’ll now start digitizing all my documents and finally get rid of paper as much as possible.

MrCheese · October 29, 2021, 2:30pm

Thanks for the paperless-ng recommendation; it was rather easy to spin up the Docker containers on my QNAP.

bamboo · October 30, 2021, 3:42pm

$ ocrmypdf -l fra MyInputFile MyOutputFile

MyInputFile and MyOutputFile can have different names or the same name.
The -l option sets the language of the document so that the OCR can be tuned to it (here fra, for French).
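
For documents that mix languages, ocrmypdf accepts several language codes joined with `+` (for example `-l fra+deu`). A small sketch of building such a call for in-place OCR, where input and output are the same file as mentioned above; the helper name `ocr_in_place_command` is hypothetical and the example is only illustrative:

```python
from pathlib import Path
from typing import Sequence


def ocr_in_place_command(pdf: Path, languages: Sequence[str]) -> list[str]:
    """Build an ocrmypdf call that writes the OCRed PDF over the input file."""
    if not languages:
        raise ValueError("at least one Tesseract language code is required")
    # ocrmypdf takes multiple languages as a single '+'-joined argument.
    return ["ocrmypdf", "-l", "+".join(languages), str(pdf), str(pdf)]
```

The resulting list can be passed to `subprocess.run(..., check=True)`; each language used needs its Tesseract language pack installed.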