I found the Scribus project and I am now completing the book with it. It took a little time to learn the functions, but I'm really happy with it. The PDF output is great, and I found all the layout adjustments that I needed. Thanks, Scribus!
One tricky problem I encountered was sharing the book content for review. I could not find an easy way to export the written content as text so that I could paste it into a word processing file. Scribus is free, but the people who are reviewing this book are not likely to enjoy it as much as I do. I tried copying the text from the text frames and pasting it. This was tedious, and the copied text did not include any line breaks, so I had to hunt through the pasted text and insert a line break after each paragraph.
Since I am likely to submit more versions of the book for review, I thought it would be useful to have an automated way to extract the text. Scribus uses XML source files, so I wrote XSLT to transform them to plain text.
Here's the XSLT stylesheet that writes the text of a Scribus document to a plain text file. It handles my document, which is not very complex. One challenging aspect of the Scribus source XML is that character strings are held in ITEXT elements (in the CH attribute) that are siblings of the paragraph markers (the para and trail elements), rather than children of them. So the export XSLT has to include logic for recognizing the paragraph structures. I prefer explicit structures in XML documents, like wrapping contents inside a <para> element, but I guess the Scribus team had their reasons for keeping things flat.
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">

  <!--
    #!/bin/bash
    # Example bash script for transforming a Scribus file.
    # This requires an XSLT 2.0 compatible XSLT processor,
    # in this case Saxon v9.

    SAXON_JAR_PATH="/path/to/saxon/download/saxon9he.jar"
    INPUT_FILE="my-scribus-file.sla"
    OUTPUT_FILE="my-output-text-file.txt"
    XSLT_FILE="this-xslt-file.xslt"

    java -classpath $SAXON_JAR_PATH \
      net.sf.saxon.Transform \
      -o ${OUTPUT_FILE} \
      ${INPUT_FILE} \
      ${XSLT_FILE}
  -->

  <xsl:output method="text" />

  <!-- Start by explicitly selecting the root of the DOM, and then
       applying templates to PAGEOBJECT elements. Sort the selected
       elements by the OwnPage attributes, which indicate the document
       order. Then sort the selected elements by the YPOS attributes,
       which indicate the order on the page, roughly, and assuming
       you're reading from top to bottom. Pick a different attribute
       for the secondary sort if you prefer. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//PAGEOBJECT">
      <xsl:sort select="@OwnPage" data-type="number" />
      <xsl:sort select="@YPOS" data-type="number" />
    </xsl:apply-templates>
  </xsl:template>

  <!-- Working with Scribus XML is tricky because the para
       elements are siblings of the ITEXT elements that hold text
       strings. I would have expected nested elements. But I guess
       there's a Scribus-related reason. -->
  <xsl:template match="ITEXT">
    <xsl:value-of select="@CH" />
  </xsl:template>

  <!-- Write a tab character (&#9;) if the ITEXT is followed by a
       tab element. -->
  <xsl:template match="ITEXT[name(following-sibling::*[1])='tab']">
    <xsl:value-of select="@CH" />
    <xsl:text>&#9;</xsl:text>
  </xsl:template>

  <!-- The newline character (&#10;) in the xsl:text element creates a
       line break between the paragraphs of your Scribus document.
       I believe the trail element is equivalent to the end of a
       paragraph. -->
  <xsl:template match="ITEXT[name(following-sibling::*[1])=('para', 'trail')]">
    <xsl:value-of select="@CH" />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- I don't want lots of newlines and spaces, so text nodes must
       be suppressed. -->
  <xsl:template match="text()" />

</xsl:stylesheet>
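For readers without an XSLT 2.0 processor at hand, the same sibling-based logic can be roughly sketched in Python using only the standard library. This is an illustration of the approach, not a replacement for the stylesheet. The sample document is a made-up, heavily simplified stand-in for a real .sla file that uses only the element and attribute names the stylesheet relies on (PAGEOBJECT, OwnPage, YPOS, ITEXT, CH, tab, para, trail). The names SAMPLE_SLA, extract_text, and walk are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a Scribus .sla file.
# A real file contains many more elements and attributes.
SAMPLE_SLA = """\
<DOCUMENT>
  <PAGEOBJECT OwnPage="1" YPOS="40">
    <ITEXT CH="Second page text."/>
    <trail/>
  </PAGEOBJECT>
  <PAGEOBJECT OwnPage="0" YPOS="300">
    <ITEXT CH="Name"/>
    <tab/>
    <ITEXT CH="Value"/>
    <trail/>
  </PAGEOBJECT>
  <PAGEOBJECT OwnPage="0" YPOS="100">
    <ITEXT CH="First paragraph."/>
    <para/>
    <ITEXT CH="Second paragraph."/>
    <trail/>
  </PAGEOBJECT>
</DOCUMENT>
"""


def walk(parent, out):
    """Append text from ITEXT descendants of `parent`, in document order."""
    children = list(parent)
    for i, child in enumerate(children):
        if child.tag != "ITEXT":
            # Descend into anything that is not an ITEXT element.
            walk(child, out)
            continue
        out.append(child.get("CH", ""))
        # ElementTree has no sibling axis, so look ahead in the child list.
        next_tag = children[i + 1].tag if i + 1 < len(children) else None
        if next_tag == "tab":
            out.append("\t")
        elif next_tag in ("para", "trail"):
            out.append("\n")


def extract_text(sla_xml):
    """Return the text of all frames, sorted by page and then by YPOS."""
    root = ET.fromstring(sla_xml)
    frames = sorted(
        root.iter("PAGEOBJECT"),
        key=lambda f: (float(f.get("OwnPage", "0")), float(f.get("YPOS", "0"))),
    )
    out = []
    for frame in frames:
        walk(frame, out)
    return "".join(out)


if __name__ == "__main__":
    print(extract_text(SAMPLE_SLA), end="")
```

As in the stylesheet, the frames are sorted first by OwnPage and then by YPOS, and a tab or newline is written only when the element immediately after an ITEXT is a tab, para, or trail element. To try it on a real document, read the .sla file into a string and pass it to extract_text. How well it copes with a more complex layout is untested here.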