Documentation for the Goobi plugin LayoutWizzard released

You have probably already heard about our work on page recognition at events over the last few months, and may have had the opportunity to take a look at the LayoutWizzard interface. Because interest in this development has grown steadily during that time, we have written detailed documentation for this still fairly new Goobi plugin and are publishing it here.

In case you didn’t know anything about LayoutWizzard yet, here’s a short summary of what it’s actually used for:

Generous scanning including black borders

In our opinion, master digital copies should be scanned with a generous frame around the individual pages. This ensures that the digital copies always contain everything relevant and that nothing is accidentally removed by an incorrectly set frame or by automatic cropping.

Automatic image analysis

LayoutWizzard's automatic image analysis checks all digitized images, determines the actual page in each one and straightens it, based on an analysis of the entire page and not just the printed text on it. The position of the book fold is then determined depending on whether the page is a left or a right page.
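
To illustrate the idea of finding the page inside a generously framed scan, here is a minimal Python sketch. It is not LayoutWizzard's actual algorithm, which is not described here. It assumes a grayscale image given as a nested list of brightness values (0 to 255), a dark scanner background and an unskewed page. The names `find_page_box` and `fold_edge` and the threshold of 40 are hypothetical.

```python
from typing import List, Optional, Tuple

# (left, top, right, bottom); right and bottom are exclusive
Box = Tuple[int, int, int, int]


def find_page_box(gray: List[List[int]], threshold: int = 40) -> Optional[Box]:
    """Return the bounding box of all pixels brighter than `threshold`.

    Assumption: the scanner background is darker than the threshold and
    the page itself is brighter. Skew is ignored in this sketch.
    """
    # Rows that contain at least one bright (page) pixel
    rows = [y for y, row in enumerate(gray) if any(p > threshold for p in row)]
    if not rows:
        return None  # nothing but background found
    # Columns that contain at least one bright (page) pixel
    cols = [x for x in range(len(gray[0]))
            if any(row[x] > threshold for row in gray)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def fold_edge(is_left_page: bool) -> str:
    """A left page has the book fold on its right edge, a right page on its left."""
    return "right" if is_left_page else "left"


# A tiny 4x6 "scan": a bright 2x2 page surrounded by a black border
scan = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 200, 210, 0, 0],
    [0, 0, 205, 220, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
page_box = find_page_box(scan)
```

A real analysis would also have to estimate the rotation of the page and handle the opposite page showing beyond the fold, both of which this sketch leaves out.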

Visual inspection and correction of analysis results

Following the automatic image analysis, a Goobi user can check the determined values in LayoutWizzard. If a value deviates seriously from the average of adjacent pages, or if the analysis was not certain that a recognition is correct, the user can intervene manually. The user thus confirms or corrects the analysis results before they are applied. Because left and right pages are checked separately, each in a list view showing several pages at once, this check is very efficient.
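
The "deviation from the average of adjacent pages" check can be pictured with a small Python sketch. This is only an illustration under assumptions, not the plugin's implementation: the function name `pages_needing_review`, the window of two pages on each side and the 10 % tolerance are all hypothetical.

```python
from typing import List


def pages_needing_review(widths: List[float], tolerance: float = 0.10,
                         window: int = 2) -> List[int]:
    """Return indices of pages whose detected width deviates from the
    average of their neighbouring pages by more than `tolerance`.

    Assumption: `widths` holds the detected page widths (in pixels) of
    either all left pages or all right pages, in reading order.
    """
    flagged = []
    for i, width in enumerate(widths):
        # Up to `window` pages before and after, excluding the page itself
        neighbours = widths[max(0, i - window):i] + widths[i + 1:i + 1 + window]
        if not neighbours:
            continue  # a single page has nothing to be compared with
        mean = sum(neighbours) / len(neighbours)
        if mean and abs(width - mean) / mean > tolerance:
            flagged.append(i)
    return flagged


# Detected widths of six right pages; the fourth one looks suspicious
right_page_widths = [2000, 2010, 1995, 2300, 2005, 1990]
suspicious = pages_needing_review(right_page_widths)
```

Since left and right pages are reviewed separately, a function like this would be called once per side. The same comparison could be applied to other detected values such as page height or rotation angle.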

Automatic cropping of images into a separate derivative

Finally, once a user has confirmed or corrected the analysis results, this information is used for the actual image processing. Like the analysis, this processing runs within the TaskManager as an independent plugin. The determined and confirmed values are used to cut the actual page out of the existing master images (with their generous frames) and save it as an independent derivative in a different directory.
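
The principle of cropping into a separate derivative while leaving the master untouched can be sketched in a few lines of Python. This is a simplified illustration, not the TaskManager plugin's code: the image is modelled as a nested list of pixel values, and the function names and the folder names `master` and `cropped` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive
Box = Tuple[int, int, int, int]


def crop(pixels: List[List[int]], box: Box) -> List[List[int]]:
    """Return a new pixel grid containing only the confirmed page area.

    The input grid is not modified, mirroring the idea that the master
    image stays unchanged.
    """
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


def derivative_path(master_file: str, derivative_dir: str) -> str:
    """Build the target path: same file name, different directory."""
    return str(PurePosixPath(derivative_dir) / PurePosixPath(master_file).name)


# Master image with a border of zeros around a 2x2 page
master = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
    [0, 0, 0, 0],
]
confirmed_box = (1, 1, 3, 3)  # values confirmed by the user (assumed)
derivative = crop(master, confirmed_box)
target = derivative_path("process/master/0001.tif", "process/cropped")
```

Keeping the file name identical in both folders makes it easy to match each cropped page to its master image later on.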

Cropped end result alongside the master images

The end result for the Goobi user is therefore both the original master folder with the generously scanned digitized images, including the black frame, book fold, etc., and a separate folder with the cropped individual pages, in which the straightened pages have no black borders and no part of the opposite page beyond the book fold.

The derivative that LayoutWizzard creates in addition to the master digital copies makes the subsequent steps much easier. Besides the smaller file size, the corrected alignment and the lower toner consumption when printing, these images also allow significantly better text recognition (OCR), and a generated e-book, e.g. an EPUB file, achieves higher quality. Last but not least, in addition to the cropped version of the images, your long-term archive still contains the generously scanned master digital copies. Better safe than sorry.

Goobi Plugin LayoutWizzard: The LayoutWizzard is anchored in Goobi's workflow with a total of three work steps, two automatic and one user-operated.
A user accepts a task as usual and then enters the LayoutWizzard plugin with its own user interface.
After entering the LayoutWizzard embedded in Goobi, the user already has some settings available. Usually, however, the user switches to the preview view immediately.
In a preview list the user can see directly how the images will look after cropping. The position of the book fold can be changed directly.
Every correction process can be controlled at a very fine-grained level. For example, it is possible to specify for individual or all images exactly how the pages are to be straightened.
The computationally intensive processes of image analysis and saving take place in the TaskManager.
The master digitized files remain unchanged in Goobi and keep their black borders.
The cropped and straightened images no longer have black borders.

If you would like to learn more about LayoutWizzard, just have a look at its documentation. This can be found at the following address:

http://files.intranda.com/8zjxe1h9735lbox7g1f1