The Analyzed Layout and Text Object (ALTO) is an open XML standard originally developed by the EU project METAe (2004) and maintained by the Library of Congress since 2010. Its goal is to describe OCR-recognized text: the text itself (strings) and its layout (the coordinates of columns, lines and words on a page).
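To make the pairing of strings and coordinates concrete, here is a minimal Python sketch that reads the words and their bounding boxes out of an ALTO document. The sample fragment is invented for illustration and is deliberately incomplete (it is not validated against the schema); it assumes the commonly seen `TextBlock` / `TextLine` / `String` nesting, the `CONTENT`, `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes, and the version 3 namespace. The function name `extract_words` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample: one text block, one line, two words separated by a space (SP).
SAMPLE_ALTO = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="P1" WIDTH="2480" HEIGHT="3508">
      <PrintSpace>
        <TextBlock ID="TB1" HPOS="200" VPOS="300" WIDTH="900" HEIGHT="60">
          <TextLine ID="TL1" HPOS="200" VPOS="300" WIDTH="900" HEIGHT="60">
            <String CONTENT="Hello" HPOS="200" VPOS="300" WIDTH="210" HEIGHT="60"/>
            <SP HPOS="410" VPOS="300" WIDTH="30"/>
            <String CONTENT="world" HPOS="440" VPOS="300" WIDTH="230" HEIGHT="60"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>
""".strip()


def extract_words(alto_xml):
    """Return one dict per String element: its text and bounding box."""
    root = ET.fromstring(alto_xml)

    # The namespace URI differs between ALTO versions, so read it from the
    # root tag ("{uri}alto") instead of hard-coding one version.
    prefix = ""
    if root.tag.startswith("{"):
        prefix = root.tag[: root.tag.index("}") + 1]

    words = []
    for line in root.iter(prefix + "TextLine"):
        for string in line.iter(prefix + "String"):
            words.append(
                {
                    "text": string.get("CONTENT", ""),
                    "line_id": line.get("ID"),
                    # Coordinates are in the unit declared by the file
                    # (not shown in this fragment).
                    "hpos": float(string.get("HPOS", 0)),
                    "vpos": float(string.get("VPOS", 0)),
                    "width": float(string.get("WIDTH", 0)),
                    "height": float(string.get("HEIGHT", 0)),
                }
            )
    return words


if __name__ == "__main__":
    for word in extract_words(SAMPLE_ALTO):
        print(word["text"], word["hpos"], word["vpos"], word["width"], word["height"])
```

Reading the namespace from the root element is a convenience choice so the same sketch can be tried on files from different ALTO versions; real files usually carry additional metadata and styling elements that this example ignores.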
Links: Library of Congress, Veridian