99.999% would be fantastic.
90% is not good enough to be a primary feature that discourages inspection (like a naive chatbot).
What we have now is like…I dunno, anywhere from <1% to maybe 80% depending on your use case and definition of accuracy, I guess?
I haven’t used Samsung’s stuff specifically. Some web search engines do cite their sources, and I find that to be a nice little time-saver. With the prevalence of SEO spam, most results have like one meaningful sentence buried in 10 paragraphs of nonsense. When the AI can effectively extract that tiny morsel of information, it’s great.
Ideally, I don’t ever want to hear an AI’s opinion, and I don’t ever want information that’s baked into the model from training. I want it to process text with an awareness of complex grammar, syntax, and vocabulary. That’s what LLMs are actually good at.
According to the Programme for the International Assessment of Adult Competencies (PIAAC) 2013 survey, the median score for the US was “level 2”. 3.9% scored below level 1, and 4.2% were “non-starters”, unable to complete the questionnaire.
For context, here is the difference between level 2 and level 3, from https://en.wikipedia.org/wiki/Programme_for_the_International_Assessment_of_Adult_Competencies#Competence_groups :