Does Lemmy have any communities dedicated to archiving/hoarding data?

  • Gerowen@lemmy.world
    English
    arrow-up
    4
    ·
    edit-2
    53 minutes ago

    Neither are that bad honestly. I have jigdo scripts I run with every point release of Debian and have a copy of English Wikipedia on a Kiwix mirror I also host. Wikipedia is a tad over 100 GB. The source, arm64 and amd64 complete repos (DVD images) for Debian Trixie, including the network installer and a couple live boot images, are 353 GB.

    Kiwix has copies of a LOT of stuff, including Wikipedia on their website. You can view their zim files with a desktop application or host your own web version. Their website is: https://kiwix.org/

    If you want (or if Wikipedia is censored for you) you can also look at my mirror to see what a web hosted version looks like: https://kiwix.marcusadams.me/

    Note: I use Anubis to help block scrapers. You should have no issues as a human other than you may see a little anime girl for a second on first load, but every once in a while Brave has a disagreement with her and a page won’t load correctly. I’ve only seen it in Brave, and only rarely, but I’ve seen it once or twice so thought I’d mention it.

  • pyrflie@lemmy.dbzer0.com
    arrow-up
    24
    ·
    edit-2
    3 hours ago

    Welcome to datahoarders.

    We’ve been here for decades.

    Also, follow 3-2-1, people: 3 backups, 2 storage media, 1 offsite.
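The 3-2-1 rule above is easy to turn into a quick sanity check. This is only a minimal sketch: the `Backup` record, its field names and the example entries are hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    """One backup of the data (hypothetical record for illustration)."""
    label: str
    medium: str    # e.g. "hdd", "flash", "tape"
    offsite: bool  # stored away from the primary location


def satisfies_3_2_1(backups):
    """True if there are >= 3 backups, on >= 2 distinct media, with >= 1 offsite."""
    media = {b.medium for b in backups}
    return (
        len(backups) >= 3
        and len(media) >= 2
        and any(b.offsite for b in backups)
    )


# Example inventory (made-up entries).
backups = [
    Backup("desktop HDD", "hdd", False),
    Backup("microSD in a drawer", "flash", False),
    Backup("drive at a friend's place", "hdd", True),
]
print(satisfies_3_2_1(backups))
```

Dropping the offsite entry, or keeping everything on one kind of medium, makes the check fail.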

  • utopiah@lemmy.world
    arrow-up
    14
    ·
    4 hours ago

    FWIW :

    fabien@debian2080ti:/media/fabien/slowdisk$ ls -lhS offline_prep/
    total 341G
    -rw-r--r-- 1 fabien fabien 103G Jul  6  2024 wikipedia_en_all_maxi_2024-01.zim
    -rw-r--r-- 1 fabien fabien  81G Apr 22  2023 gutenberg_mul_all_2023-04.zim
    -rw-r--r-- 1 fabien fabien  75G Jul  7  2024 stackoverflow.com_en_all_2023-11.zim
    -rw-r--r-- 1 fabien fabien  74G Mar 10  2024 planet-240304.osm.pbf
    -rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso
    -rw-r--r-- 1 fabien fabien 2.6G May  7  2023 ifixit_en_all_2023-04.zim
    -rw-r--r-- 1 fabien fabien 1.6G May  7  2023 developer.mozilla.org_en_all_2023-02.zim
    -rw-r--r-- 1 fabien fabien 931M May  7  2023 diy.stackexchange.com_en_all_2023-03.zim
    -rw-r--r-- 1 fabien fabien 808M Jun  5  2023 wikivoyage_en_all_maxi_2023-05.zim
    -rw-r--r-- 1 fabien fabien 296M Apr 30  2023 raspberrypi.stackexchange.com_en_all_2022-11.zim
    -rw-r--r-- 1 fabien fabien 131M May  7  2023 rapsberry_pi_docs_2023-01.zim
    -rw-r--r-- 1 fabien fabien 100M May  7  2023 100r-off-the-grid_en_2022-06.zim
    -rw-r--r-- 1 fabien fabien  61M May  7  2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim
    -rw-r--r-- 1 fabien fabien  45M May  7  2023 computergraphics.stackexchange.com_en_all_2022-11.zim
    -rw-r--r-- 1 fabien fabien  37M May  7  2023 wordnet_en_all_2023-04.zim
    -rw-r--r-- 1 fabien fabien  23M Jul 17  2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz
    -rw-r--r-- 1 fabien fabien  16M Oct  6 21:32 be-stib-gtfs.zip
    -rw-r--r-- 1 fabien fabien 3.8M Oct  6 21:32 be-sncb-gtfs.zip
    -rw-r--r-- 1 fabien fabien 2.3M May  7  2023 termux_en_all_maxi_2022-12.zim
    -rw-r--r-- 1 fabien fabien 1.9M May  7  2023 kiwix-firefox_3.8.0.xpi
    
    

    but if you want the easier version just get Kiwix on whatever device in front of you right now (yes, even mobile phone assuming you have the space) then get whatever content you need.

    If you need a bit of help, I recorded TechSovereignty at home, episode 11 - Offline Wikipedia, Kiwix and checksums with a friend just 3 weeks ago.

    I also randomly update https://fabien.benetou.fr/Content/Vademecum and coded https://git.benetou.fr/utopiah/offline-octopus but tbh KDE-Connect is much better now.

    The point though is having such a repository takes minutes. If you don’t have the space, buy a 512GB microSD for 50EUR then put that on, stuff it in a drawer then move on. If you want to, update it every 3 months or whenever you feel like it.

    TL;DR: takes longer to write such a meme than actually do it.
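Since checksums come up in the comment above, here is a minimal sketch of verifying a large download such as a ZIM file. It assumes the publisher provides a SHA-256 hex digest next to the file. The function names are hypothetical, and the file is read in chunks so a 100 GB archive never has to fit in memory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's digest with a published one (case and whitespace tolerant)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would look like `matches_published_digest("wikipedia_en_all_maxi_2024-01.zim", published_hex)`, where `published_hex` is whatever digest the download site lists for that exact file.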

  • mazzilius_marsti@lemmy.world
    arrow-up
    12
    ·
    7 hours ago

    we need all repos to be stored offline, and documentation for troubleshooting.

    the 1st i have no idea how much space we will need. Most linux packages are pretty light, no? But there are A LOT of them…

    the 2nd is easy. Heard someone say the entirety of wikipedia is 200GB, should be doable. Don’t forget the technical wikis too: Debian, Gentoo, Arch.

    • skisnow@lemmy.ca
      English
      arrow-up
      4
      ·
      4 hours ago

      Can’t remember who it was (b3ta? popbitch? penny-arcade?), but I recently saw a comment by someone who’s been running a website since the turn of the millennium, and they said that fully 99% of the links they posted two decades ago were no longer valid.

      To really put that into perspective, remember that for a site to get linked from a popular site like that, it usually had to be something of value that had a lot of work put into it, and that people found interesting or useful.

    • clif@lemmy.world
      arrow-up
      22
      ·
      edit-2
      11 hours ago

      Last time I updated it was closer to 120GB but if you’re not sweating 100 GB then an extra 20 isn’t going to bother anyone these days.

      Also, thanks for reminding me that I need to check my dates and update.

      EDIT: you can also easily configure an SBC like a Raspberry Pi (or any of the clones) that will boot, set the Wi-Fi to access point mode, and serve kiwix as a website that anyone (on the local AP wifi network) can connect to and query… And it’ll run off a USB battery pack. I have one kicking around the house somewhere
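For the "serve kiwix as a website" part, a rough sketch of the command a boot script could run. `kiwix-serve` and its `--port` option come from kiwix-tools. The helper name, the port and the file name are assumptions for illustration, and the access point setup itself is not covered here.

```python
import shlex


def kiwix_serve_command(zim_paths, port=8080):
    """Build the argv for kiwix-serve (helper name and default port are assumptions)."""
    if not zim_paths:
        raise ValueError("at least one ZIM file is required")
    return ["kiwix-serve", f"--port={port}", *map(str, zim_paths)]


# Example with a file name taken from the listing earlier in the thread.
cmd = kiwix_serve_command(["wikipedia_en_all_maxi_2024-01.zim"])
print(shlex.join(cmd))
```

On the device the list can be handed to `subprocess.run(cmd, check=True)`, and clients on the local network would then browse to the device's address on that port.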

      • techwithjake@sh.itjust.works
        arrow-up
        10
        ·
        10 hours ago

        Just built one of those using DietPi as the OS and an NVMe M.2 drive for storage. I have many different ZIMs and run several services, and I’m only using about 270GB.

        Works great for offline use. Probably should add an ISO or 2 as well.

    • mistermodal@lemmy.ml
      arrow-up
      25
      ·
      14 hours ago

      Yeah also if you make a Zim wiki or convert a website into Zim then you can run that stuff too. If you use Emacs it’s easy to convert some pages to wikitext for Zim too

    • Gigasser@lemmy.world
      arrow-up
      1
      ·
      9 hours ago

      I wonder if there’s any way to edit these files afterwards? They tend to be read only, right? I must confess, I don’t have too much experience with this myself.

      • Prathas@lemmy.zip
        arrow-up
        2
        ·
        4 hours ago

        It’s probably hundreds of thousands of HTML files, no? What is the fear about being able to edit or not?

  • Pumpkin Escobar@lemmy.world
    English
    arrow-up
    38
    ·
    11 hours ago

    I stumbled across this sort of fascinating area of doomsday prepping a few weeks back.

    https://prepperpress.com/usb/

    A nice addition to that, don’t just make it a USB, but a raspberry pi. So you’d have a reasonably low-powered computer you could easily take with you.

    Not suggesting this one as it seems a bit expensive to me, but https://www.prepperdisk.com/products/prepper-disk-premium-over-512gb-of-survival-content?view=sl-8978CA41

      • boonhet@sopuli.xyz
        arrow-up
        2
        ·
        4 hours ago

        You’d first have to buy a phone that can run postmarketOS and these are much rarer than I wish they were. Is there even anything new that can run it? Pine64 stopped making phones and said they’ll make a new one when they can make it RISC-V.

        Fairphone maybe I guess. 4 is listed as a supported device, but someone has gotten it working on 6 too.

      • techwithjake@sh.itjust.works
        arrow-up
        1
        ·
        4 hours ago

        Cause if ya wanna go overboard like I did: 1TB of NVMe storage, and I can add an SD card if necessary. 16GB RAM. Very little learning curve on my part as I use SBCs often. Plus almost every Docker container and program I want works on RPi without any hassle.

        There’s also more robust guides and community for RPi.

        Just my thoughts.

    • techwithjake@sh.itjust.works
      arrow-up
      13
      ·
      edit-2
      10 hours ago

      Just built one of these myself. I went NVMe M.2 instead of SD card to avoid data corruption. I know SD cards are fine if you don’t write to them a lot, but if you wanna update or add your own stuff, that scares me. Plus NVMe is just so much faster.

        • techwithjake@sh.itjust.works
          arrow-up
          3
          ·
          4 hours ago

          Pretty much what Sinthesis said; USB power brick and/or solar panels. Both at the ready and tested. Also got a big ass battery backup that will charge off solar panels.

        • Bob Robertson IX @discuss.tchncs.de
          English
          arrow-up
          8
          ·
          6 hours ago

          You find a generator, or solar panels, or wind mill, or water turbine, or a bicycle hooked up to a generator.

          If electricity permanently goes out then we’re in a scavenger situation and it is time to start taking apart things that are no longer necessary to build the things that are.

        • Sinthesis@lemmy.today
          English
          arrow-up
          5
          ·
          6 hours ago

          You only need 20 watts of power. One of those dinky fold up solar panels would work. Add a USB power brick for cloudy days.
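A back-of-the-envelope runtime estimate for the 20 watt figure above. Everything except the 20 W load is an assumption: the 20,000 mAh power brick, the 3.7 V nominal cell voltage and the 85% conversion efficiency are placeholder values, not measurements.

```python
def mah_to_wh(capacity_mah, nominal_volts=3.7):
    """Convert a battery rating in mAh to watt-hours at the cell's nominal voltage."""
    return capacity_mah / 1000 * nominal_volts


def runtime_hours(battery_wh, load_watts, efficiency=0.85):
    """Estimated hours of runtime for a constant load, after conversion losses."""
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * efficiency / load_watts


# Hypothetical 20,000 mAh power brick feeding the 20 W load from the thread.
brick_wh = mah_to_wh(20_000)
print(f"{brick_wh:.0f} Wh lasts about {runtime_hours(brick_wh, 20):.1f} h at 20 W")
```

A board that idles well below 20 W would run proportionally longer, so measuring the actual draw is worth doing before sizing a panel or battery.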

  • juipeltje@lemmy.world
    arrow-up
    26
    ·
    11 hours ago

    Yeah, not gonna lie, I think I heard someone in a YouTube video a while back talk about how the entirety of Wikipedia takes up like 200 gigs or something like that, and it got me seriously considering actually making that offline backup. Shit is scary when countries like the UK are basically blocking you from having easy access to knowledge.

    • mic_check_one_two@lemmy.dbzer0.com
      English
      arrow-up
      9
      ·
      11 hours ago

      Yeah, it’s surprisingly small when it’s compressed if you exclude things like images and media. It’s just text, after all. But the high level of compression requires special software to actually read without uncompressing the entire archive. There are dedicated devices you can get, which pretty much only do that. Like there are literal Wikipedia readers, where you just give it an archive file and it’ll allow you to search for and read articles.
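To illustrate reading one article without uncompressing the whole archive, here is a toy sketch of the general idea: compress articles in small independent groups and keep an index of where each one lives. It is not the actual ZIM layout. The function names, the cluster size and the use of zlib are all assumptions chosen to keep the example within the standard library.

```python
import zlib


def build_archive(articles, cluster_size=2):
    """Pack articles into independently compressed clusters.

    Returns (clusters, index), where index maps a title to
    (cluster number, byte offset inside the cluster, byte length).
    """
    clusters = []
    index = {}
    titles = sorted(articles)
    for start in range(0, len(titles), cluster_size):
        raw = b""
        for title in titles[start:start + cluster_size]:
            body = articles[title].encode("utf-8")
            index[title] = (len(clusters), len(raw), len(body))
            raw += body
        clusters.append(zlib.compress(raw, 9))
    return clusters, index


def read_article(clusters, index, title):
    """Decompress only the cluster holding the requested article."""
    cluster_no, offset, length = index[title]
    raw = zlib.decompress(clusters[cluster_no])
    return raw[offset:offset + length].decode("utf-8")


# Made-up sample content.
articles = {
    "Debian": "Debian is a Linux distribution. " * 20,
    "Kiwix": "Kiwix is an offline reader. " * 20,
    "Wikipedia": "Wikipedia is a free encyclopedia. " * 20,
}
clusters, index = build_archive(articles)
print(read_article(clusters, index, "Kiwix")[:27])
```

Bigger clusters compress better but cost more work per lookup, which is the trade-off any reader of such an archive has to balance.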

          • mfed1122@discuss.tchncs.de
            English
            arrow-up
            3
            ·
            4 hours ago

            If my experience with mashing the random article button is any indicator, you could reduce the size by 30% just by removing articles on sports players. I doubt I’ll need those

    • palordrolap@fedia.io
      arrow-up
      6
      ·
      11 hours ago

      UKGOV haven’t started on things like Wikipedia yet. They know kids use it for school and blinded by ideology though they are, even they can see there’d be an enormous backlash if they blocked it any time soon.

      If that’s going to happen at all, I doubt it would be before the next election. That’s whether Labour get re-elected or the Tories make an unexpected comeback. You can tell how far Labour have fallen in the eyes of their party faithful when they’ve taken a Tory-drafted policy and made it their own.

      Ironically, the up-and-coming third option fascist party have said they’re going to repeal the Online Safety Act. They have other fish to fry if they get in, and they’ll want to keep their preferred demographic(s) happy while they do it.

      I assume that eventually something like the OSA would come back to “protect the children”. They love the current US President.

      None of this is hopeful. Take this as more of a rant.