I’m sketching out an idea for a readability assessment program. It will report the education level required to comfortably read a body of text, using formulas (Dale-Chall being the most significant) that weigh sentence length, the vocabulary level each word is considered to be, and so on. I was inspired by the word-counter website I always paste my essays into. When it’s done, I’d like to plug it into the Lemmy, Mastodon, and Discord APIs so bots can use it.
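For anyone curious, the scoring step itself is tiny. Here’s a minimal sketch in C of the (new) Dale-Chall formula, assuming the tokenizing and the lookup against the Dale-Chall familiar-word list happen elsewhere, so the counts arrive as plain integers:

```c
#include <stdio.h>

/* New Dale-Chall readability score. The caller produces the three
 * counts (and ensures total_words and total_sentences are nonzero);
 * difficult_words means words NOT on the ~3,000-word Dale-Chall
 * familiar-word list. */
double dale_chall(int total_words, int total_sentences, int difficult_words)
{
    double pct_difficult = 100.0 * difficult_words / total_words;
    double avg_sentence_len = (double)total_words / total_sentences;
    double score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len;
    if (pct_difficult > 5.0)
        score += 3.6365;   /* adjustment applied to harder texts */
    return score;          /* roughly maps to US grade bands */
}

int main(void)
{
    /* hypothetical counts for a short passage */
    printf("Dale-Chall score: %.2f\n", dale_chall(120, 8, 10));
    return 0;
}
```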

  • arbitrary · 1 year ago
    With a dataset that small, imo either option is fine. If it were me, I would use an ORM + SQLite just to start, in case I ever needed to migrate to a “real” database.

    • rufuyunOP · 1 year ago

      Thank you!

      > ORM + SQLite

      I am writing in C (the CLI, which I’ll just have the bots use) and have never used any databases. Do you think using the SQLite interface straight up with C, plus some cursory reading of the docs, would be too much? Of course I could switch it all to C++, for which there appears to be at least one nice ORM.

      • arbitrary · 1 year ago

        I think if you’re storing vocabulary and the like, using the C interface for SQLite wouldn’t be too unwieldy, and it would be a good learning experience if you haven’t done much raw SQL query writing of your own. Even when you use an ORM, there are often times you need to write your own queries for more complicated situations.
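        To give a feel for it, here’s a minimal sketch of the raw SQLite C API. The vocab.db filename and the words table are made up for illustration; link with -lsqlite3:

        ```c
        #include <stdio.h>
        #include <sqlite3.h>

        int main(void)
        {
            sqlite3 *db;
            sqlite3_stmt *stmt;

            if (sqlite3_open("vocab.db", &db) != SQLITE_OK) {
                fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
                return 1;
            }

            /* one-off statements can go through sqlite3_exec() */
            sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS words"
                             "(word TEXT PRIMARY KEY, grade INTEGER);",
                         NULL, NULL, NULL);

            /* lookups use a prepared statement with a bound parameter */
            sqlite3_prepare_v2(db, "SELECT grade FROM words WHERE word = ?;",
                               -1, &stmt, NULL);
            sqlite3_bind_text(stmt, 1, "comfortably", -1, SQLITE_STATIC);

            if (sqlite3_step(stmt) == SQLITE_ROW)
                printf("grade: %d\n", sqlite3_column_int(stmt, 0));
            else
                printf("word not found\n");

            sqlite3_finalize(stmt);
            sqlite3_close(db);
            return 0;
        }
        ```

        That open/prepare/bind/step/finalize cycle is most of what you’d use day to day.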

        One other suggestion: once you have the CLI and bots working, you could abstract this even further. Have a service process that does the actual text analysis and communicates over some channel (IPC, a network port, etc.). Your CLI and bots then just talk to it over that channel. This gives you separation of duties, so you can implement new clients and servers, or rework either side, much more easily.
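        As one possible shape for that service (just a sketch: the port, the line-oriented send-text/read-back-a-score protocol, and the analyze() stub are all assumptions, and error handling is trimmed for brevity):

        ```c
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <sys/socket.h>
        #include <arpa/inet.h>

        /* Stand-in for the real readability code. */
        static double analyze(const char *text) { (void)text; return 7.3; }

        int main(void)
        {
            int srv = socket(AF_INET, SOCK_STREAM, 0);
            struct sockaddr_in addr = {0};
            addr.sin_family = AF_INET;
            addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* local clients only */
            addr.sin_port = htons(5555);                   /* arbitrary port */

            bind(srv, (struct sockaddr *)&addr, sizeof addr);
            listen(srv, 8);

            for (;;) {
                /* one request per connection: read text, reply with score */
                int cli = accept(srv, NULL, NULL);
                char buf[4096];
                ssize_t n = read(cli, buf, sizeof buf - 1);
                if (n > 0) {
                    buf[n] = '\0';
                    char reply[64];
                    snprintf(reply, sizeof reply, "%.2f\n", analyze(buf));
                    write(cli, reply, strlen(reply));
                }
                close(cli);
            }
        }
        ```

        While developing you could poke at it with something like echo "text here" | nc 127.0.0.1 5555, and the bots would do the same thing programmatically.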