Problems with PRONOM

The problem is that PRONOM does not allow a user to know what a file is down to the bit level, which is important because in the future the software on which one currently depends will not be available.

The PRONOM web page states that it is “a resource for anyone requiring impartial and definitive information about the file formats, software products and other technical components required to support long-term access to electronic records and other digital objects of cultural, historical or business value.”

DROID is “an automatic file format identification tool providing categories of format identification for unknown files in a digital collection.”

However, the information provided for the file is rather limited.

For example, when one searches for a “.csv” file, PRONOM provides the following information at the time of writing:

PRONOM CSV

Selecting “Character Encoding”, one sees:

PRONOM-CSV-characterencoding.png

One can see that this certainly does not go down to the bit level. Not knowing these details will mean that information will be lost.

Even looking at the information about ASCII one sees, at the time of writing, the following:

PRONOM-ASCII.png

As shown above, there are no links to the definition of the encoding, which is 7-bit ASCII (ASCII-7), as opposed to an 8-bit variant (ASCII-8).

Looking instead at PRONOM for "plain text", one sees:

PRONOM-plain-text.png

No character encoding is provided, meaning that it is impossible to really preserve the information. For example, as a little experimentation will show, a CSV file encoded in ASCII-7 gives very different results from one encoded in UTF-16 when opened by, for example, MS Excel.
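
The difference can be seen at the byte level with a minimal sketch. It assumes Python's built-in codecs; the sample CSV content is hypothetical and chosen only for illustration, and the little-endian form of UTF-16 without a byte order mark is an arbitrary choice made so that the output does not depend on the platform.

```python
# Hypothetical two-row CSV content, used only for illustration.
text = "name,value\r\nalpha,1\r\n"

# The same characters written out under two different encodings.
ascii_bytes = text.encode("ascii")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark

# ASCII uses one byte per character here; UTF-16 uses two.
print(len(ascii_bytes), ascii_bytes[:4].hex(" "))
print(len(utf16_bytes), utf16_bytes[:8].hex(" "))

# The two byte sequences are different, although the characters are the same.
print(ascii_bytes == utf16_bytes)
```

A reader that assumes the wrong one of these two encodings will not recover the original rows and columns from the bytes.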

More generally, given the text file one can only extract the characters if one knows the encoding. Currently one normally relies on one's software to make a best guess as to the encoding, but this can be wrong. On the other hand, given the encoding one can extract the characters without software.
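
As a rough illustration of both points, the sketch below decodes the same four bytes under two different assumed encodings, and then decodes a short ASCII byte sequence using nothing more than a lookup table. The byte values, the choice of Latin-1 and code page 437 as the competing guesses, and the three-entry table are assumptions made for this example; a real table would cover the whole character set.

```python
# Four bytes whose encoding is not recorded anywhere.
data = bytes([0x63, 0x61, 0x66, 0xE9])

# Two plausible guesses give two different final characters.
as_latin1 = data.decode("latin-1")  # 0xE9 is read as 'é'
as_cp437 = data.decode("cp437")     # 0xE9 is read as 'Θ'
print(as_latin1, as_cp437, as_latin1 == as_cp437)

# Read as 7-bit ASCII the data is not valid at all, because 0xE9 > 0x7F.
try:
    data.decode("ascii")
    valid_ascii = True
except UnicodeDecodeError:
    valid_ascii = False
print(valid_ascii)

# When the encoding is known, decoding is a table lookup that could be
# done by hand. This is a hypothetical three-entry excerpt of the ASCII table.
ASCII_TABLE = {0x61: "a", 0x2C: ",", 0x62: "b"}
raw = bytes([0x61, 0x2C, 0x62])
chars = "".join(ASCII_TABLE[b] for b in raw)
print(chars)
```

The lookup at the end does not depend on any particular program: anyone with the table and the bytes can reproduce it.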
Topic revision: r2 - 05 Apr 2025, DavidGiaretta