foray-commit Mailing List for FOray
Modular XSL-FO Implementation for Java.
Status: Alpha
Brought to you by:
victormote
From: <vic...@us...> - 2007-09-30 16:28:43
Revision: 10226
http://foray.svn.sourceforge.net/foray/?rev=10226&view=rev
Author: victormote
Date: 2007-09-30 09:28:43 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Update release documentation.
Modified Paths:
--------------
trunk/foray/doc/web/dev/admin/release.html
Modified: trunk/foray/doc/web/dev/admin/release.html
===================================================================
--- trunk/foray/doc/web/dev/admin/release.html 2007-09-30 16:15:45 UTC (rev 10225)
+++ trunk/foray/doc/web/dev/admin/release.html 2007-09-30 16:28:43 UTC (rev 10226)
@@ -28,13 +28,14 @@
release.</p>
<h2><a name="before"/>Before</h2>
<ol>
+ <li>Make sure all tests pass.</li>
<li>Make sure that org.foray.core.Application.getVersion() has the correct
version number.</li>
<li>Make sure that the build script build-common has the correct value for the
property "version".</li>
<li>Update the release notes. Commit, but don't publish yet.</li>
<li>Lock the repository or at least make a note of the repository revision
- number which should be tagged.</li>
+ number which should be tagged: _________________.</li>
</ol>
<h2><a name="build"/>Build</h2>
@@ -57,14 +58,22 @@
<h2><a name="after"/>After</h2>
<ol>
- <li>Tag the repository.</li>
+ <li>Tag the repository, by copying the trunk revision to tags/rel_[revision],
+where [revision] is the id of the release being created, periods being
+converted to underscores. For example, for the 1.0 release, the trunk revision
+should be copied to "tags/rel_1_0". Enter the name of the tag here:
+_________________________.</li>
<li>Unlock the repository, if it was locked above.</li>
<li>Upload distribution file to sourceforge.</li>
<li>Add link to release notes on sourceforge.</li>
+ <li>Download releases and make sure downloaded files are identical to uploaded
+files.</li>
<li>Publish website to get the new release notes updated.</li>
<li>Announce the release at:
<ol>
- <li>The foray-dev and foray-announce mailing lists.</li>
+ <li>The foray-announce (foray-announce at lists.sourceforge.net),
+foray-developer (foray-developer at lists.sourceforge.net), and foray-support
+(foray-support at lists.sourceforge.net) mailing lists.</li>
<li>www.w3c.org</li>
<li>www.xslfo.info</li>
</ol>
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
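Note on revision 10226: the tag-naming rule added to the release checklist above (copy the trunk revision to tags/rel_[revision], with periods converted to underscores) can be sketched in Java as follows. This is only a minimal illustration. The class name ReleaseTag and the method tagPathFor are hypothetical and are not part of FOray; the checklist itself describes a manual step.

```java
/**
 * Hypothetical helper illustrating the tag-naming convention from the
 * release checklist. Not part of the FOray code base.
 */
final class ReleaseTag {

    private ReleaseTag() { }

    /**
     * Builds the repository tag path for a release id.
     * Periods in the id are converted to underscores, so "1.0"
     * becomes "tags/rel_1_0".
     */
    static String tagPathFor(final String releaseId) {
        if (releaseId == null || releaseId.isEmpty()) {
            throw new IllegalArgumentException("release id must not be empty");
        }
        return "tags/rel_" + releaseId.replace('.', '_');
    }

    public static void main(final String[] args) {
        // The example given in the checklist.
        System.out.println(tagPathFor("1.0"));
        // The 0.3 release, matching the tag created in revision 10221.
        System.out.println(tagPathFor("0.3"));
    }
}
```

For the 0.3 release this yields tags/rel_0_3, which matches the path copied from trunk in revision 10221.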
From: <vic...@us...> - 2007-09-30 16:15:42
Revision: 10225
http://foray.svn.sourceforge.net/foray/?rev=10225&view=rev
Author: victormote
Date: 2007-09-30 09:15:45 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Upgrade version numbers for development.
Modified Paths:
--------------
trunk/foray/foray-core/src/java/org/foray/core/Application.java
trunk/foray/scripts/build-common.xml
Modified: trunk/foray/foray-core/src/java/org/foray/core/Application.java
===================================================================
--- trunk/foray/foray-core/src/java/org/foray/core/Application.java 2007-09-30 14:21:41 UTC (rev 10224)
+++ trunk/foray/foray-core/src/java/org/foray/core/Application.java 2007-09-30 16:15:45 UTC (rev 10225)
@@ -64,7 +64,7 @@
* @return The current version of the application.
*/
public static String getVersion() {
- return "0.3";
+ return "0.4-dev";
}
/**
Modified: trunk/foray/scripts/build-common.xml
===================================================================
--- trunk/foray/scripts/build-common.xml 2007-09-30 14:21:41 UTC (rev 10224)
+++ trunk/foray/scripts/build-common.xml 2007-09-30 16:15:45 UTC (rev 10225)
@@ -38,7 +38,7 @@
<property name="name.lowercase" value="foray"/>
<property name="contact.info"
value="The FOray project http://www.foray.org"/>
- <property name="version" value="0.3"/>
+ <property name="version" value="0.4-dev"/>
<property name="debug" value="on"/>
<property name="optimize" value="off"/>
<property name="deprecation" value="off"/>
From: <vic...@us...> - 2007-09-30 14:21:45
Revision: 10224
http://foray.svn.sourceforge.net/foray/?rev=10224&view=rev
Author: victormote
Date: 2007-09-30 07:21:41 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Fix erroneous link.
Modified Paths:
--------------
trunk/foray/doc/web/app/using/release.html
Modified: trunk/foray/doc/web/app/using/release.html
===================================================================
--- trunk/foray/doc/web/app/using/release.html 2007-09-30 14:18:11 UTC (rev 10223)
+++ trunk/foray/doc/web/app/using/release.html 2007-09-30 14:21:41 UTC (rev 10224)
@@ -23,7 +23,7 @@
<ul>
<li><a href="/00-release/notes-unreleased.html">Unreleased Changes</a></li>
- <li><a href="/00-release/notes-0_002.html">Release Notes for FOray 0.3</a>,
+ <li><a href="/00-release/notes-0_003.html">Release Notes for FOray 0.3</a>,
September 30, 2007.</li>
<li><a href="/00-release/notes-0_002.html">Release Notes for FOray 0.2</a>,
September 30, 2006.</li>
From: <vic...@us...> - 2007-09-30 14:18:07
Revision: 10223
http://foray.svn.sourceforge.net/foray/?rev=10223&view=rev
Author: victormote
Date: 2007-09-30 07:18:11 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Fix bad link.
Modified Paths:
--------------
trunk/foray/doc/web/app/using/download.html
Modified: trunk/foray/doc/web/app/using/download.html
===================================================================
--- trunk/foray/doc/web/app/using/download.html 2007-09-30 14:15:59 UTC (rev 10222)
+++ trunk/foray/doc/web/app/using/download.html 2007-09-30 14:18:11 UTC (rev 10223)
@@ -21,7 +21,7 @@
<p class="Warning">The current release of FOray (0.3) is not a mature product
and may not be suitable for your needs.
-Please see the <a href="/00-release/notes-0_003">Release Notes</a> for more
+Please see the <a href="/00-release/notes-0_003.html">Release Notes</a> for more
information.</p>
<p>The FOray application can be downloaded from <a rel="external"
From: <vic...@us...> - 2007-09-30 14:15:55
Revision: 10222
http://foray.svn.sourceforge.net/foray/?rev=10222&view=rev
Author: victormote
Date: 2007-09-30 07:15:59 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Get doc caught up to release.
Modified Paths:
--------------
trunk/foray/doc/web/00-release/notes-0_002.html
trunk/foray/doc/web/00-release/notes-0_003.html
trunk/foray/doc/web/app/using/download.html
Modified: trunk/foray/doc/web/00-release/notes-0_002.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_002.html 2007-09-30 12:48:29 UTC (rev 10221)
+++ trunk/foray/doc/web/00-release/notes-0_002.html 2007-09-30 14:15:59 UTC (rev 10222)
@@ -18,7 +18,7 @@
<h1>FOray: Release Notes, Release 0.2</h1>
<h3>Release 0.2 Changes of Interest to Users</h3>
-<p class="warning">FOray 0.2 is primarily for developers and module users.
+<p class="Warning">FOray 0.2 is primarily for developers and module users.
Although many parts of FOray work well, and are usable as modules, the
application as a whole has some general problems that require attention.</p>
Modified: trunk/foray/doc/web/00-release/notes-0_003.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_003.html 2007-09-30 12:48:29 UTC (rev 10221)
+++ trunk/foray/doc/web/00-release/notes-0_003.html 2007-09-30 14:15:59 UTC (rev 10222)
@@ -18,7 +18,18 @@
<h1>FOray: Release Notes, Release 0.3</h1>
<h3>Release 0.3 Changes of Interest to Users</h3>
+<p class="Warning">FOray 0.3 is primarily for developers and module users.
+Although many parts of FOray work well, and are usable as modules, the
+application as a whole has some general problems that require attention.</p>
+
+<p>Here are the more significant of the known issues:</p>
<ul>
+ <li>Footnotes are not laid out in the document.</li>
+ <li>There are many general layout problems.</li>
+</ul>
+
+<p>Other changes:</p>
+<ul>
<li>FOray is now dependent on Java 5.0 or higher.
Previous releases were depending on Java 1.4 or higher.</li>
<li>FOray no longer requires JAI libraries to be installed for PNG and TIFF
Modified: trunk/foray/doc/web/app/using/download.html
===================================================================
--- trunk/foray/doc/web/app/using/download.html 2007-09-30 12:48:29 UTC (rev 10221)
+++ trunk/foray/doc/web/app/using/download.html 2007-09-30 14:15:59 UTC (rev 10222)
@@ -19,9 +19,9 @@
<h2><a name="intro">Introduction</a></h2>
-<p class="Warning">The current release of FOray (0.2) is not a mature product
+<p class="Warning">The current release of FOray (0.3) is not a mature product
and may not be suitable for your needs.
-Please see the <a href="release.html#0_2">Release Notes</a> for more
+Please see the <a href="/00-release/notes-0_003">Release Notes</a> for more
information.</p>
<p>The FOray application can be downloaded from <a rel="external"
@@ -47,7 +47,7 @@
basic usability of the download very easily:</p>
<ol>
<li>Unzip the distribution.</li>
- <li>Make sure your "JAVE_HOME" environment variable is set to a Java 1.4 or
+ <li>Make sure your "JAVE_HOME" environment variable is set to a Java 5 or
higher runtime.</li>
<li>Change directory to the "scripts" directory in the distribution and run
either "foray-test.bat" (for Windows) or "foray-test.sh" (For Unix and Linux).
From: <vic...@us...> - 2007-09-30 12:48:25
Revision: 10221
http://foray.svn.sourceforge.net/foray/?rev=10221&view=rev
Author: victormote
Date: 2007-09-30 05:48:29 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
made a copy
Added Paths:
-----------
tags/rel_0_3/
Copied: tags/rel_0_3 (from rev 10220, trunk)
From: <vic...@us...> - 2007-09-30 12:13:41
Revision: 10220
http://foray.svn.sourceforge.net/foray/?rev=10220&view=rev
Author: victormote
Date: 2007-09-30 05:13:35 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Update version numbers in preparation for a release.
Modified Paths:
--------------
trunk/foray/foray-core/src/java/org/foray/core/Application.java
trunk/foray/scripts/build-common.xml
trunk/foray/scripts/dist-test.sh
Modified: trunk/foray/foray-core/src/java/org/foray/core/Application.java
===================================================================
--- trunk/foray/foray-core/src/java/org/foray/core/Application.java 2007-09-30 11:59:56 UTC (rev 10219)
+++ trunk/foray/foray-core/src/java/org/foray/core/Application.java 2007-09-30 12:13:35 UTC (rev 10220)
@@ -64,7 +64,7 @@
* @return The current version of the application.
*/
public static String getVersion() {
- return "0.3-dev";
+ return "0.3";
}
/**
Modified: trunk/foray/scripts/build-common.xml
===================================================================
--- trunk/foray/scripts/build-common.xml 2007-09-30 11:59:56 UTC (rev 10219)
+++ trunk/foray/scripts/build-common.xml 2007-09-30 12:13:35 UTC (rev 10220)
@@ -38,7 +38,7 @@
<property name="name.lowercase" value="foray"/>
<property name="contact.info"
value="The FOray project http://www.foray.org"/>
- <property name="version" value="0.3-dev"/>
+ <property name="version" value="0.3"/>
<property name="debug" value="on"/>
<property name="optimize" value="off"/>
<property name="deprecation" value="off"/>
Modified: trunk/foray/scripts/dist-test.sh
===================================================================
--- trunk/foray/scripts/dist-test.sh 2007-09-30 11:59:56 UTC (rev 10219)
+++ trunk/foray/scripts/dist-test.sh 2007-09-30 12:13:35 UTC (rev 10220)
@@ -5,7 +5,7 @@
# Modify the following variable to the appropriate value before running this
# script.
-FORAY_VERSION="0.2"
+FORAY_VERSION="0.3"
TEMP_DIR=$HOME/tmp
UNZIP_DIR=$TEMP_DIR/unzip
From: <vic...@us...> - 2007-09-30 11:59:54
Revision: 10219
http://foray.svn.sourceforge.net/foray/?rev=10219&view=rev
Author: victormote
Date: 2007-09-30 04:59:56 -0700 (Sun, 30 Sep 2007)
Log Message:
-----------
Update release notes.
Modified Paths:
--------------
trunk/foray/doc/web/00-release/notes-0_001.html
trunk/foray/doc/web/00-release/notes-0_002.html
trunk/foray/doc/web/00-release/notes-unreleased.html
trunk/foray/doc/web/00-rsrc/include/leftmenu-dev.html
trunk/foray/doc/web/app/using/release.html
trunk/foray/doc/web/dev/admin/release.html
Added Paths:
-----------
trunk/foray/doc/web/00-release/notes-0_003.html
Modified: trunk/foray/doc/web/00-release/notes-0_001.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_001.html 2007-09-29 23:32:09 UTC (rev 10218)
+++ trunk/foray/doc/web/00-release/notes-0_001.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -22,7 +22,7 @@
<p>Release 0.1 is totally oriented toward modularization of the system, and
toward improvements to the font system.</p>
-<h3>Changes from FOP 0.20.5 to FOray 0.1 of interest to Users</h3>
+<h3>Changes from FOP 0.20.5 to FOray 0.1 of Interest to Users</h3>
<p>Current work is deliberately focused on architecture, not features. However,
the following improvements have resulted as side-effects of the architectural
changes:</p>
@@ -48,7 +48,7 @@
No other changes have been made to the way fonts are configured.</li>
</ul>
-<h3>Changes from FOP 0.20.5 to FOray 0.1 of interest to Developers</h3>
+<h3>Changes from FOP 0.20.5 to FOray 0.1 of Interest to Developers</h3>
<ul>
<li>Font-related classes have been extracted into a separate module.</li>
<li>Graphic and PDF-related classes have also been extracted into separate
Modified: trunk/foray/doc/web/00-release/notes-0_002.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_002.html 2007-09-29 23:32:09 UTC (rev 10218)
+++ trunk/foray/doc/web/00-release/notes-0_002.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -17,7 +17,7 @@
<h1>FOray: Release Notes, Release 0.2</h1>
-<h3>Release 0.2 changes of interest to Users</h3>
+<h3>Release 0.2 Changes of Interest to Users</h3>
<p class="warning">FOray 0.2 is primarily for developers and module users.
Although many parts of FOray work well, and are usable as modules, the
application as a whole has some general problems that require attention.</p>
@@ -311,7 +311,7 @@
for more information.</li>
</ul>
-<h3>Release 0.2 changes of interest to Developers</h3>
+<h3>Release 0.2 Changes of Interest to Developers</h3>
<h4>Modularity Changes</h4>
<p>As of Release 0.2, all FOray modules are independent of each other, using
Added: trunk/foray/doc/web/00-release/notes-0_003.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_003.html (rev 0)
+++ trunk/foray/doc/web/00-release/notes-0_003.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -0,0 +1,90 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html
+ PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+
+<head>
+ <title>FOray: Release Notes, Release 0.3</title>
+ <meta name="content-revised"
+ content="$Date$"/>
+ <!--#include virtual="/00-rsrc/include/standard-head.html" -->
+</head>
+
+<body>
+<!--#include virtual="/00-rsrc/include/leftmenu.html" -->
+
+<h1>FOray: Release Notes, Release 0.3</h1>
+
+<h3>Release 0.3 Changes of Interest to Users</h3>
+<ul>
+ <li>FOray is now dependent on Java 5.0 or higher.
+ Previous releases were depending on Java 1.4 or higher.</li>
+ <li>FOray no longer requires JAI libraries to be installed for PNG and TIFF
+ support. These services are now provided natively through open-source
+ libraries packaged with FOray.</li>
+ <li>Tables now work much better (not perfectly).</li>
+ <li>Lists now work much better (not perfectly).</li>
+ <li>Added support for the axsl:metadata extension. See
+ <a href="/app/features/extensions.html#axsl-extensions">aXSL
+ Extensions</a>.</li>
+ <li>Added parsing and validation support for the objects and properties added
+ in XSL-FO 1.1.
+ These are not necessarily used by other FOray modules, but their existence
+ should not cause processing errors.
+ (The bookmark-related objects are fully supported).</li>
+ <li>Significant improvements to expressions in property values.</li>
+ <li>FOray now has limited support for embedding EPS (encapsulated PostScript)
+ files in PDF output. See <a href="../features/graphics.html#eps">EPS
+ Graphics</a> for more details.</li>
+ <li>Support has been discontinued for the "continued-label" object in the
+ "foray" namespace. This general capability has been replaced by new features
+ in XSL-FO 1.1, which will be implemented in FOray as user needs are expressed
+ and as developer resources are available.</li>
+</ul>
+
+<h3>Release 0.3 Changes of Interest to Developers</h3>
+<ul>
+ <li>Please review the list of
+ <a href="http://www.axsl.org/rel-notes.html">unreleased changes to aXSL</a>.
+ FOray maintains its API to conform to aXSL's, even between releases.</li>
+ <li>The class org.foray.common.StringUtilPre5 has been removed, and uses of
+ its methods have been replaced by standard Java 5.0 String and Character
+ methods.</li>
+ <li>Completion of javadoc API documentation for the FOrayGraphic,
+ FOrayHyphen-R, FOrayFOTree, FOrayOutput, FOrayLayout, FOrayText, FOrayMIF,
+ FOrayPDF, FOrayArea, FOrayRender, FOrayApp, and FOrayCore modules.
+ All FOray modules now have comprehensive javadocs.</li>
+ <li>The FOrayGraphicServer constructor no longer requires the name of the
+ parser class as a parameter.</li>
+ <li>JUnit tests have been introduced in most modules. Our testing is far
+ from comprehensive, but some good infrastructure is now in place which is
+ already assisting in maintaining stability.</li>
+ <li>Hyphenation pattern files are now named using the 3-character ISO-639
+ language codes instead of the 2-character codes previously used. This change
+ is transparent to the user (they can still use either the 2- or 3-character
+ code in input). This change was made to accommodate languages for which no
+ 2-character code was assigned.</li>
+ <li>FOTree properties have been overhauled a bit.
+ Each property now has its own class, with abstract superclasses where
+ appropriate.
+ This makes storage of the property type redundant, and it has been removed,
+ which should reduce the memory footprint.
+ Also the PropertyList no longer keeps a reference to the parent FObj.</li>
+ <li>The Apache Commons Logging dependency has been upgraded to version
+ 1.1.</li>
+ <li>The Batik libraries (for SVG processing) have been upgraded to 1.6.</li>
+ <li>Batik dependencies have been eliminated from the Common, FOTree, AreaTree,
+ Pioneer Layout, PDF and Renderer modules. All Batik dependencies are now
+ confined to the Graphic package, and are thus effectively hidden behind the
+ aXSL Graphic interfaces. This makes it much more feasible for users to drop in
+ another (perhaps commercial) SVG package. It also makes maintenance of the
+ Batik integration much more straightforward.</li>
+ <li>The Apache XML Graphics Commons dependency has been upgraded to
+ version 1.2.</li>
+</ul>
+
+<!--#include virtual="/00-rsrc/include/leftmenu-end.html" -->
+</body>
+</html>
Property changes on: trunk/foray/doc/web/00-release/notes-0_003.html
___________________________________________________________________
Name: svn:keywords
+ "Author Id Rev Date URL"
Name: svn:eol-style
+ native
Modified: trunk/foray/doc/web/00-release/notes-unreleased.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-unreleased.html 2007-09-29 23:32:09 UTC (rev 10218)
+++ trunk/foray/doc/web/00-release/notes-unreleased.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -22,29 +22,7 @@
<h3>Unreleased changes of interest to Users</h3>
<ul>
- <li>FOray is now dependent on Java 5.0 or higher.
- Previous releases were depending on Java 1.4 or higher.</li>
- <li>FOray no longer requires JAI libraries to be installed for PNG and TIFF
- support. These services are now provided natively through open-source
- libraries packaged with FOray.</li>
- <li>Tables now work much better (not perfectly).</li>
- <li>Lists now work much better (not perfectly).</li>
- <li>Added support for the axsl:metadata extension. See
- <a href="/app/features/extensions.html#axsl-extensions">aXSL
- Extensions</a>.</li>
- <li>Added parsing and validation support for the objects and properties added
- in XSL-FO 1.1.
- These are not necessarily used by other FOray modules, but their existence
- should not cause processing errors.
- (The bookmark-related objects are fully supported).</li>
- <li>Significant improvements to expressions in property values.</li>
- <li>FOray now has limited support for embedding EPS (encapsulated PostScript)
- files in PDF output. See <a href="../features/graphics.html#eps">EPS
- Graphics</a> for more details.</li>
- <li>Support has been discontinued for the "continued-label" object in the
- "foray" namespace. This general capability has been replaced by new features
- in XSL-FO 1.1, which will be implemented in FOray as user needs are expressed
- and as developer resources are available.</li>
+ <li>There are currently no such unreleased changes.</li>
</ul>
<h3>Unreleased changes of interest to Developers</h3>
@@ -52,40 +30,6 @@
<li>Please review the list of
<a href="http://www.axsl.org/rel-notes.html">unreleased changes to aXSL</a>.
FOray maintains its API to conform to aXSL's, even between releases.</li>
- <li>The class org.foray.common.StringUtilPre5 has been removed, and uses of
- its methods have been replaced by standard Java 5.0 String and Character
- methods.</li>
- <li>Completion of javadoc API documentation for the FOrayGraphic,
- FOrayHyphen-R, FOrayFOTree, FOrayOutput, FOrayLayout, FOrayText, FOrayMIF,
- FOrayPDF, FOrayArea, FOrayRender, FOrayApp, and FOrayCore modules.
- All FOray modules now have comprehensive javadocs.</li>
- <li>The FOrayGraphicServer constructor no longer requires the name of the
- parser class as a parameter.</li>
- <li>JUnit tests have been introduced in most modules. Our testing is far
- from comprehensive, but some good infrastructure is now in place which is
- already assisting in maintaining stability.</li>
- <li>Hyphenation pattern files are now named using the 3-character ISO-639
- language codes instead of the 2-character codes previously used. This change
- is transparent to the user (they can still use either the 2- or 3-character
- code in input). This change was made to accommodate languages for which no
- 2-character code was assigned.</li>
- <li>FOTree properties have been overhauled a bit.
- Each property now has its own class, with abstract superclasses where
- appropriate.
- This makes storage of the property type redundant, and it has been removed,
- which should reduce the memory footprint.
- Also the PropertyList no longer keeps a reference to the parent FObj.</li>
- <li>The Apache Commons Logging dependency has been upgraded to version
- 1.1.</li>
- <li>The Batik libraries (for SVG processing) have been upgraded to 1.6.</li>
- <li>Batik dependencies have been eliminated from the Common, FOTree, AreaTree,
- Pioneer Layout, PDF and Renderer modules. All Batik dependencies are now
- confined to the Graphic package, and are thus effectively hidden behind the
- aXSL Graphic interfaces. This makes it much more feasible for users to drop in
- another (perhaps commercial) SVG package. It also makes maintenance of the
- Batik integration much more straightforward.</li>
- <li>The Apache XML Graphics Commons dependency has been upgraded to
- version 1.2.</li>
</ul>
<!--#include virtual="/00-rsrc/include/leftmenu-end.html" -->
Modified: trunk/foray/doc/web/00-rsrc/include/leftmenu-dev.html
===================================================================
--- trunk/foray/doc/web/00-rsrc/include/leftmenu-dev.html 2007-09-29 23:32:09 UTC (rev 10218)
+++ trunk/foray/doc/web/00-rsrc/include/leftmenu-dev.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -154,7 +154,7 @@
<tr>
<td class="Bullet1"> </td>
<td class="Menu1">
- <a class="Menu" href="/dev/admin/release.html">Release</a>
+ <a class="Menu" href="/dev/admin/release.html">Release Mechanics</a>
</td>
</tr>
</table>
Modified: trunk/foray/doc/web/app/using/release.html
===================================================================
--- trunk/foray/doc/web/app/using/release.html 2007-09-29 23:32:09 UTC (rev 10218)
+++ trunk/foray/doc/web/app/using/release.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -23,6 +23,8 @@
<ul>
<li><a href="/00-release/notes-unreleased.html">Unreleased Changes</a></li>
+ <li><a href="/00-release/notes-0_002.html">Release Notes for FOray 0.3</a>,
+September 30, 2007.</li>
<li><a href="/00-release/notes-0_002.html">Release Notes for FOray 0.2</a>,
September 30, 2006.</li>
<li><a href="/00-release/notes-0_001.html">Release Notes for FOray 0.1</a>,
Modified: trunk/foray/doc/web/dev/admin/release.html
===================================================================
--- trunk/foray/doc/web/dev/admin/release.html 2007-09-29 23:32:09 UTC (rev 10218)
+++ trunk/foray/doc/web/dev/admin/release.html 2007-09-30 11:59:56 UTC (rev 10219)
@@ -32,7 +32,7 @@
version number.</li>
<li>Make sure that the build script build-common has the correct value for the
property "version".</li>
- <li>Update the release notes.</li>
+ <li>Update the release notes. Commit, but don't publish yet.</li>
<li>Lock the repository or at least make a note of the repository revision
number which should be tagged.</li>
</ol>
@@ -48,7 +48,7 @@
<li>foray-[version]-src.zip</li>
</ul>
</li>
- <li>Run the test-dist.sh shell script. This script looks for potential
+ <li>Run the dist-test.sh shell script. This script looks for potential
problems in the distribution files. If any problems are reported, fix them
before proceeding.</li>
<li>Unzip the "bin-all" distribution file, and run both the foray-test.bat
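Note on revision 10219: the 0.3 release notes added above say that hyphenation pattern files are now named with 3-character ISO-639 language codes, while input may still use either the 2- or 3-character code. One way such a normalization could be done with the standard library is sketched below. This is an assumption for illustration, not FOray's actual implementation; the class name HyphenationCode and the method toIso3 are hypothetical.

```java
import java.util.Locale;

/**
 * Hypothetical sketch: normalizes a 2- or 3-character ISO-639 language
 * code to its 3-character form. Not part of the FOray code base.
 */
final class HyphenationCode {

    private HyphenationCode() { }

    /**
     * Returns the 3-character code for the given language code.
     * A 3-character input is returned unchanged (lower-cased).
     * java.util.Locale throws MissingResourceException when a
     * 2-character code has no known 3-character equivalent.
     */
    static String toIso3(final String code) {
        if (code == null || (code.length() != 2 && code.length() != 3)) {
            throw new IllegalArgumentException(
                    "expected a 2- or 3-character language code");
        }
        final String lower = code.toLowerCase(Locale.ROOT);
        return Locale.forLanguageTag(lower).getISO3Language();
    }

    public static void main(final String[] args) {
        System.out.println(toIso3("en"));   // 2-character input
        System.out.println(toIso3("eng"));  // 3-character input
    }
}
```

Both calls in main resolve to the same code, so a lookup keyed on the 3-character form would find the same pattern file for either input.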
From: <vic...@us...> - 2007-09-29 23:32:05
Revision: 10218
http://foray.svn.sourceforge.net/foray/?rev=10218&view=rev
Author: victormote
Date: 2007-09-29 16:32:09 -0700 (Sat, 29 Sep 2007)
Log Message:
-----------
Split release documentation up.
Modified Paths:
--------------
trunk/foray/doc/web/app/using/release.html
Added Paths:
-----------
trunk/foray/doc/web/00-release/
trunk/foray/doc/web/00-release/notes-0_001.html
trunk/foray/doc/web/00-release/notes-0_002.html
trunk/foray/doc/web/00-release/notes-unreleased.html
Added: trunk/foray/doc/web/00-release/notes-0_001.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_001.html (rev 0)
+++ trunk/foray/doc/web/00-release/notes-0_001.html 2007-09-29 23:32:09 UTC (rev 10218)
@@ -0,0 +1,74 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html
+ PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+
+<head>
+ <title>FOray: Release Notes, Release 0.1</title>
+ <meta name="content-revised"
+ content="$Date$"/>
+ <!--#include virtual="/00-rsrc/include/standard-head.html" -->
+</head>
+
+<body>
+<!--#include virtual="/00-rsrc/include/leftmenu.html" -->
+
+<h1>FOray: Release Notes, Release 0.1</h1>
+
+<p>Release 0.1 can be obtained from the source
+code repository by using the tag <code>rel_0_1_branch</code>.</p>
+<p>Release 0.1 is totally oriented toward modularization of the system, and
+toward improvements to the font system.</p>
+
+<h3>Changes from FOP 0.20.5 to FOray 0.1 of interest to Users</h3>
+<p>Current work is deliberately focused on architecture, not features. However,
+the following improvements have resulted as side-effects of the architectural
+changes:</p>
+<ul>
+ <li>Embedded CID fonts had a prefix added to their names in the PDF output
+file. This has been removed.</li>
+ <li>Fonts subsetting should now be thread-safe. Each document has its own
+list of characters used for font embedding/subsetting purposes. This needs to
+be tested in a multi-threading environment. (Also, please note that, because
+of limitations in the handling of System fonts which have yet to be resolved,
+some custom coding will probably be required to keep the FontServer running
+across multiple documents.</li>
+</ul>
+<p>In addition, the following changes should be noted:</p>
+<ul>
+ <li>All font configuration entries, which used to reside in the FOP
+configuration file, must now be located in a separate file. There is a new
+FOray configuration entry <strong>font-configuration</strong> which should be used
+to point to this
+new file.</li>
+ <li>The font configuration file should have a root element of
+<foray-font-config>. The <font> elements are children of the root.
+No other changes have been made to the way fonts are configured.</li>
+</ul>
+
+<h3>Changes from FOP 0.20.5 to FOray 0.1 of interest to Developers</h3>
+<ul>
+ <li>Font-related classes have been extracted into a separate module.</li>
+ <li>Graphic and PDF-related classes have also been extracted into separate
+modules as a side-effect of getting the Font-related classes extracted.</li>
+ <li>Subclass relationships between Font-related classes have been cleaned up
+extensively.</li>
+ <li>The relationship between Font concepts and PDF implementation
+concepts has been clarified greatly (probably needs more work).</li>
+ <li>Visibility on most Font-related classes has been reduced.</li>
+ <li>The following methods in the forayFont class called Font (called
+fonts/FontMetrics in FOP's HEAD) were returning values in millionths of a point:
+getAscender(), getDescender(), getCapHeight(), getXHeight(), and getCharWidth().
+Methods using values returned by these methods were typically dividing the
+returned values by 1000. The Font methods have been changed to return values in
+millipoints (1/1000 of a point), and the corresponding methods in fop-maint have
+been changed to <i>not </i>divide by 1000. Although this makes sense, it may be
+confusing to anyone trying to implement the forayFont module into FOP's
+HEAD.</li>
+</ul>
+
+<!--#include virtual="/00-rsrc/include/leftmenu-end.html" -->
+</body>
+</html>
Property changes on: trunk/foray/doc/web/00-release/notes-0_001.html
___________________________________________________________________
Name: svn:keywords
+ "Author Id Rev Date URL"
Name: svn:eol-style
+ native
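The unit change described in the 0.1 developer notes (font metrics formerly returned in millionths of a point, now returned in millipoints) can be illustrated with a minimal sketch. The class and method names below are hypothetical helpers written for this illustration only; they are not part of FOray or FOP.

```java
/**
 * Hypothetical helper illustrating the unit change in the 0.1 notes.
 * Nothing here is FOray or FOP API.
 */
final class MetricUnits {

    private MetricUnits() { }

    /** Converts a value in millionths of a point to millipoints (1/1000 of a point). */
    static int microToMillipoints(final int micropoints) {
        // This is the "divide by 1000" step that callers previously had to do.
        return (int) Math.round(micropoints / 1000.0);
    }

    /** Converts a value in millipoints to points. */
    static double millipointsToPoints(final int millipoints) {
        return millipoints / 1000.0;
    }
}
```

Under this reading, a caller that used to divide a returned metric by 1000 would now use the returned millipoint value directly.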
Added: trunk/foray/doc/web/00-release/notes-0_002.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-0_002.html (rev 0)
+++ trunk/foray/doc/web/00-release/notes-0_002.html 2007-09-29 23:32:09 UTC (rev 10218)
@@ -0,0 +1,383 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html
+ PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+
+<head>
+ <title>FOray: Release Notes, Release 0.2</title>
+ <meta name="content-revised"
+ content="$Date$"/>
+ <!--#include virtual="/00-rsrc/include/standard-head.html" -->
+</head>
+
+<body>
+<!--#include virtual="/00-rsrc/include/leftmenu.html" -->
+
+<h1>FOray: Release Notes, Release 0.2</h1>
+
+<h3>Release 0.2 changes of interest to Users</h3>
+<p class="warning">FOray 0.2 is primarily for developers and module users.
+Although many parts of FOray work well, and are usable as modules, the
+application as a whole has some general problems that require attention.</p>
+
+<p>Here are the more significant of the known issues:</p>
+<ul>
+ <li>The FO Tree does not parse all expressions properly. Valid FO syntax
+ is sometimes flagged as an error.</li>
+ <li>The Area Tree does not handle tables and lists properly in most cases.
+ Specifically, these items will likely not be placed in the proper place,
+ and may not even be placed in a reasonable place.</li>
+</ul>
+
+<h4>General Configuration Changes</h4>
+<ul>
+ <li>The names of several configuration file entries have been changed for
+the sake of consistency and clarity:
+ <table>
+ <thead>
+ <tr>
+ <th>Old Name</th>
+ <th>New Name</th>
+ </tr>
+ </thead>
+ <tr><td>baseDir</td><td>base-directory</td></tr>
+ <tr><td>debugMode</td><td>debug-mode</td></tr>
+ <tr><td>dumpConfiguration</td><td>dump-configuration</td></tr>
+ <tr><td>fontBaseDir</td><td>font-base-directory</td></tr>
+ <tr><td>hyphenation-dir</td><td>hyphenation-base-directory</td></tr>
+ <tr><td>stream-filter-list</td><td>pdf-filters</td></tr>
+ <tr><td>strokeSVGText</td><td>stroke-svg-text</td></tr>
+ </table>
+ </li>
+ <li>A new command-line option "-so" (session option) has been added to allow
+ the entry of any configuration option through the command-line. See
+ <a href="configuration.html">FOray Configuration</a> for more details.</li>
+ <li>The names of several renderer options (set through command-line options
+ or through the API) have been changed for the sake of consistency and
+ clarity. In addition, some general command-line options have been replaced
+ by general configuration options. Even though the old command-line options
+ have been removed, their
+ functional equivalent can be attained by using the new "-so" command-line
+ option together with the new configuration option.
+ <table>
+ <thead>
+ <tr>
+ <th>Old Command-line Option</th>
+ <th>Old Renderer Option Name</th>
+ <th>New Configuration Key</th>
+ </tr>
+ </thead>
+ <tr>
+ <td>-o</td>
+ <td>ownerPassword</td>
+ <td>pdf-owner-password</td>
+ </tr>
+ <tr>
+ <td>-u</td>
+ <td>userPassword</td>
+ <td>pdf-user-password</td>
+ </tr>
+ <tr>
+ <td>-noprint</td>
+ <td>allowPrint</td>
+ <td>pdf-user-print</td>
+ </tr>
+ <tr>
+ <td>-nocopy</td>
+ <td>allowCopyContent</td>
+ <td>pdf-user-copy</td>
+ </tr>
+ <tr>
+ <td>-noedit</td>
+ <td>allowEditContent</td>
+ <td>pdf-user-modify</td>
+ </tr>
+ <tr>
+ <td>-noannotations</td>
+ <td>allowEditAnnotations</td>
+ <td>pdf-user-annotate</td>
+ </tr>
+ <tr>
+ <td>-txt.encoding</td>
+ <td>unknown</td>
+ <td>txt-encoding</td>
+ </tr>
+ <tr>
+ <td>-s</td>
+ <td>unknown</td>
+ <td>at-sparse</td>
+ </tr>
+ <tr>
+ <td>-q</td>
+ <td>n/a</td>
+ <td>verbosity</td>
+ </tr>
+ <tr>
+ <td>-d</td>
+ <td>n/a</td>
+ <td>verbosity</td>
+ </tr>
+ <tr>
+ <td>-x</td>
+ <td>n/a</td>
+ <td>dump-configuration (already existed)</td>
+ </tr>
+ <tr>
+ <td>-l</td>
+ <td>n/a</td>
+ <td>language</td>
+ </tr>
+ </table>
+ </li>
+ <li>Configuration options whose value is a list are now entered as
+ space-delimited Strings. Previously these were placed in the configuration
+ file as value elements inside of a list element. This change was made to
+ facilitate making these options available on the command line.</li>
+ <li>A boolean configuration option <strong>cache-graphics</strong> was
+added.
+It defaults to "false".
+If set to true, graphic objects will be cached and can be reused, resulting in
+potential speed improvement.
+If set to false, graphic objects are not reused, resulting in potential memory
+savings.</li>
+</ul>
+
+<h4>Font Changes</h4>
+<ul>
+ <li>The intermediate font-metrics xml file is no longer used.
+Instead, font resources are read on-the-fly as needed.
+This has resulted in numerous changes to the font configuration file (see
+below).</li>
+ <li>Fonts used by the system are no longer read entirely into memory, but are
+read from disk as necessary. This, along with some other improvements, should
+reduce the amount of memory needed for fonts.</li>
+ <li>The <code>embed="all"</code> attribute for the &lt;font&gt;
+element in the font configuration
+file is now respected for CID fonts.</li>
+ <li>PDF output for CID fonts now includes the ToUnicode CMap. This allows
+cut-and-paste, index, search, etc. to work correctly on PDF text for these
+fonts. This fixes FOP's Bugzilla entry #5335.</li>
+ <li>Italic angle information is now parsed from PFM files.</li>
+ <li>FontBBox and StemV (StdVW) information is now parsed from PFA and PFB
+files.</li>
+ <li>Adobe Font Metrics (AFM) files may now be used for font metrics.
+To use this option, set the font configuration item
+<strong>metrics-file</strong> to point
+to an AFM file. AFM files are now the preferred way to communicate font metrics.
+Note that users can easily modify AFM files to correct bad font information.
+Also AFM files can theoretically be used for bitmapped fonts.</li>
+ <li>Kerning has been fixed for subsetted fonts. Kerning now works correctly
+for all supported formats.</li>
+ <li>The command-line class PFMReader has been removed.</li>
+ <li>The command-line class TTFReader has been changed. It now takes
+either one or two arguments. If given one, and that one is a TrueType collection
+file, the list of fonts within that collection will be written. If given a
+TrueType collection file followed by the name of a font within that file, some
+basic metrics will be parsed and written. If given one argument, and that
+argument is a regular TrueType file, the basic metrics of that file will be
+parsed and written. This tool is useful primarily to see the list of fonts in
+a TTC file, and as a debugging tool if metrics do not seem to be computed
+properly.</li>
+ <li>It is now possible to actively prevent the embedding of font files
+whose permissions restrict embedding, and FOray does prevent such
+embedding.</li>
+ <li>The <strong>font-base-directory</strong> configuration option is now
+used as a base directory
+for <strong>font-file</strong> and <strong>metrics-file</strong>. It was
+formerly used as a base directory for the
+intermediate xml font metrics files. </li>
+</ul>
+
+<h4>Font Configuration Changes</h4>
+<p>The configuration of fonts has changed dramatically. See
+<a href="../features/fonts.html#config">Font Configuration</a> for details on
+the changes listed.</p>
+
+<ul>
+ <li>The <strong>server</strong> element has been added.</li>
+ <li>The <strong>parameter</strong> element has been added.</li>
+ <li>The <strong>glyph-list</strong> element has been added.</li>
+ <li>The <strong>encoding</strong> element has been added.</li>
+ <li>The <strong>font-family</strong> element has been added.</li>
+ <li>The <strong>font-family-alias</strong> element has been added.</li>
+ <li>The <strong>font-triplet</strong> element has been renamed to
+<strong>font-description</strong>.
+The font-description element must now be inside a font-family element.</li>
+ <li>The <strong>name</strong> attribute of the element
+<strong>font-description</strong> (formerly <strong>font-triplet</strong>) has
+been removed.
+Instead the font-description element is placed inside a font-family element,
+which contains the name.</li>
+</ul>
+
+<p>The following have been added as attributes to the <strong>font</strong>
+element.
+These were formerly command-line options in TTFReader and PFMReader (the
+applications that built the font metrics files):</p>
+
+<ul>
+ <li><strong>font-resource</strong> (was the "-er" [embed-resource]
+command-line option).
+Font files can be included in fop.jar when building fop. Both font-resource and
+font-file can be specified for the same font, but if font-file is specified and
+can be opened, it will take precedence over font-resource. Please note that the
+companion option -ef already exists in the font configuration file as
+<strong>embed-file</strong> (now called <strong>font-file</strong>).</li>
+ <li><strong>embed-name</strong> (was the "-fn" command-line option). If
+embed-name is
+specified, it will be used as the name of the font in the output document, if
+the font is embedded in that document. This is sometimes useful when trying to
+ensure that the embedded font is used in the document instead of a system font
+with the same name. (Note that if <strong>embed-name</strong> is
+<i>not </i>specified, FOray first
+attempts to extract and embed the name from the font itself. If that fails, it
+uses the configured attribute <strong>name</strong>.)</li>
+ <li><strong>ttc-name</strong> (was the "-ttcname" command-line option).
+If you are reading data from a TrueType Collection (.ttc
+file) you must specify which font from the collection you will read metrics
+from.</li>
+ <li><strong>metrics-file</strong>. This was formerly the pfm-file that
+was designated
+positionally on the command-line. It must now be explicitly identified in the
+configuration file for Type 1 fonts. Please note that this <i>is not </i> the
+same meaning that metrics-file previously had. Before FOray 0.2, the
+metrics-file attribute pointed to the (no-longer-used) intermediate xml file
+that was parsed from font metrics information.</li>
+ <li><strong>embed</strong>. This was formerly implied by the use of
+<strong>embed-file</strong> and
+<strong>embed-resource</strong>. Since those attributes have been renamed to
+clarify their use,
+you must now explicitly state what embedding behavior is desired. Valid options
+are "all", "subset", and "none". The default is "none".</li>
+</ul>
+
+<p>In addition, the following changes related to the direct parsing of font files
+have been made:</p>
+ <ul>
+ <li>The <strong>name</strong> attribute has been added for the element
+<strong>font</strong> in the font
+configuration file. This is currently not used at all, except as a fallback
+name to be embedded in an output file if no other name can be found. Its main
+purpose is for reference in writing and parsing the configuration file, but it
+may be used as a "key" for other configuration elements in the future.</li>
+ <li>The former attribute <strong>embed-file</strong> for the element
+<strong>font</strong> in the font
+configuration file, has been renamed to <strong>font-file</strong>.
+This change was made for clarity since, for some font types, the font file
+contains both the embedding information and the metrics information.</li>
+ <li>Note that the former TTFReader command-line option "-enc" has no
+analog in the configuration file. The artificial reencoding that this option
+enabled is no longer necessary now that CID fonts are properly embedded in the
+PDF output (see change above regarding ToUnicode CMap).</li>
+ </ul>
+
+<h4>Other Changes</h4>
+<ul>
+ <li>Since we have now obtained rights to the internet domain "foray.org", the
+package structure of all FOray module classes has been moved from "com.outfitr"
+to "org.foray". The package structure for fop-maint remains unchanged.</li>
+ <li>Starting with Release 0.2, FOray requires a minimum java runtime
+environment of 1.4.</li>
+ <li><em>All</em> properties should now be supported at the parse level,
+including those properties (like aural properties) that are not directly used
+by any FOray application. This also includes all shorthand properties. Please
+note that this does not mean that all properties are used or used properly by
+the layout or rendering systems.</li>
+ <li>The class for running FOray from the command-line has been renamed
+"FOray".</li>
+ <li>The outer-layer application API has changed significantly. This will
+not affect command-line users, but will affect anyone running FOray embedded
+within another application. Details of the new API can be found at <a
+href="../../module/app/index.html">The FOray Application</a>, and at the run()
+method in the class CommandLineStarter.</li>
+ <li>Support for the Jimi package has been removed from FOrayGraphic. We
+will consider adding it back if there is sufficient user demand. It was used
+only for supporting PNG, which is adequately supported by JAI.</li>
+ <li>Extensions which used the fox: namespace in FOP 0.20.5 are supported
+ in FOray under the foray: namespace with the following exceptions:
+ 1) Support for the extension objects foray:outline and foray:label has
+ been dropped in favor of the scheme being considered for
+ <a href="http://www.w3.org/TR/xsl11/#d0e14206">XSL-FO 1.1</a>, which
+ uses bookmark-tree, bookmark, and bookmark-title objects instead.
+ 2) The extension object foray:destination has been dropped. Named
+ destinations are now automatically created for objects with the "id"
+ property.</li>
+ <li>Support for the text-transform property was added.</li>
+ <li>A configuration option "pdf-version" was added. It defaults to "1.6", the
+ latest version of PDF. See
+ <a href="./configuration.html#pdf-version">pdf-version configuration</a>
+ for more information.</li>
+</ul>
+
+<h3>Release 0.2 changes of interest to Developers</h3>
+
+<h4>Modularity Changes</h4>
+<p>As of Release 0.2, all FOray modules are independent of each other, using
+the aXSL interfaces to achieve pluggability and interoperability.
+With the exception of dependencies on the utility-type modules (Common, Pretty,
+and PS), FOray modules now have no dependencies on other FOray modules.</p>
+
+<ul>
+ <li>The following methods have been added to the FontConsumer interface to
+allow the client application to control what combinations of FreeStandingFonts
+and SystemFonts should be used: boolean isUsingFreeStandingFonts(), and
+boolean isUsingSystemFonts().</li>
+ <li>The classes FontInfo and FontTriplet, which represented temporary data
+structures used during the parsing of the font configuration file, have been
+eliminated. Fonts and font descriptions are now registered on-the-fly as the
+font configuration file is parsed.</li>
+ <li>A method FontServer.optimizeFonts(FontConsumer consumer) has been added.
+This method optimizes fonts in preparation for embedding. Currently, the only
+optimization implemented is that subset font glyph indices are sorted by their
+Unicode code point, which potentially saves space in PDF files.
+<i>Caveat: </i>The timing of when this process is run is very critical.
+It should only be run <i>after </i>all glyph indices that are used by
+the FontConsumer have been registered.
+It should also only be run <i>before </i>any glyph indices have been
+written to actual document output. If the client runs the registration
+and writing tasks concurrently, it should not use this method, as doing
+so will corrupt the logical connection between the glyph indices used to
+embed the font and those used to write the document contents. Please note that
+FOray's reference FO processing implementation is currently unable to use this
+method because of such concurrent processing.</li>
+ <li>To assist in parsing Type1 fonts, FOray now has the modest beginnings of
+a PostScript interpreter emulation. It is included as part of the new "FOrayPS"
+module. We hope to use this for future projects such as embedding EPS files in
+PDF output.</li>
+ <li>The filter and encoding classes that were part of the pdf module have
+been moved to the new ps module. The theory here is that PDF is kind of a
+superset of PostScript, and that items common to both should live with
+PostScript.</li>
+ <li>FontServer can (and in fact must) now be instantiated, and its variables
+and methods are no longer static (except for a few utility methods). It is still
+intended to be used as a singleton, but there is no requirement that it be so,
+and, with reasonable care, multiple instances should be manageable. A
+getFontServer() method has been added to the FontConsumer interface, as the
+FontConsumer must now know which FontServer it is using.</li>
+ <li>Each of the Font, Graphic, and Text subsystems now has a Server class
+associated with it (e.g. GraphicServer). Instances of these servers may be
+passed to FOraySession's constructor. If not, default instances will be created
+by FOraySession. This means that you can enhance your local system by extending
+these Server classes. Also, because the Servers live outside of the FOray
+application, they can conceivably persist for multiple Sessions.</li>
+ <li>GraphicServer has a new method registerFactory, which accepts one
+GraphicFactory as a parameter. Once registered, a custom factory will be
+consulted as each Graphic is constructed. A GraphicFactory only reads through
+enough of the file contents to determine whether the file is of the type that
+it knows how to create. If so, it can create the Graphic instance. Custom
+factories are consulted before standard factories, allowing developers to
+extend FOrayGraphic capabilities along a new dimension.</li>
+ <li>A new Namespace class has been added in FOTree to manage the
+various issues associated with managing multiple namespaces within the FOTree
+itself. Each Namespace instance is responsible to convert elements and
+attributes in its namespace into FObj and Property instances that can be
+placed in the FOTree. Custom Namespace instances can be registered to the
+FOraySession instance through the registerNamespace() method, which accepts
+one Namespace instance as a parameter.</li>
+</ul>
+
+<!--#include virtual="/00-rsrc/include/leftmenu-end.html" -->
+</body>
+</html>
Property changes on: trunk/foray/doc/web/00-release/notes-0_002.html
___________________________________________________________________
Name: svn:keywords
+ "Author Id Rev Date URL"
Name: svn:eol-style
+ native
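The 0.2 developer notes say that factories registered through registerFactory are consulted before the standard factories, and that each factory reads only enough of the content to decide whether it recognizes the type. The sketch below illustrates that lookup order under stated assumptions: the types `SketchGraphicFactory` and `SketchGraphicServer`, the `recognize` and `identify` methods, and the PNG signature check are all hypothetical stand-ins, and FOray's actual GraphicFactory and GraphicServer signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a graphic factory; not FOray's real interface. */
interface SketchGraphicFactory {
    /** Returns a format name if the leading bytes are recognized, otherwise null. */
    String recognize(byte[] header);
}

/** Hypothetical server showing "custom factories first, then standard ones". */
final class SketchGraphicServer {

    private final List<SketchGraphicFactory> customFactories = new ArrayList<>();
    private final List<SketchGraphicFactory> standardFactories = new ArrayList<>();

    SketchGraphicServer() {
        // One built-in factory that checks the first four bytes of the PNG signature.
        standardFactories.add(h -> h.length >= 4 && (h[0] & 0xFF) == 0x89
                && h[1] == 'P' && h[2] == 'N' && h[3] == 'G' ? "png" : null);
    }

    /** Registers a custom factory; it will be consulted before the standard ones. */
    void registerFactory(final SketchGraphicFactory factory) {
        customFactories.add(factory);
    }

    /** Returns the first format name claimed by any factory, or null if none claims it. */
    String identify(final byte[] header) {
        for (SketchGraphicFactory factory : customFactories) {
            final String format = factory.recognize(header);
            if (format != null) {
                return format;
            }
        }
        for (SketchGraphicFactory factory : standardFactories) {
            final String format = factory.recognize(header);
            if (format != null) {
                return format;
            }
        }
        return null;
    }
}
```

Because custom factories are checked first, a registered factory can either add a new format or take over a format that a standard factory would otherwise handle.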
Added: trunk/foray/doc/web/00-release/notes-unreleased.html
===================================================================
--- trunk/foray/doc/web/00-release/notes-unreleased.html (rev 0)
+++ trunk/foray/doc/web/00-release/notes-unreleased.html 2007-09-29 23:32:09 UTC (rev 10218)
@@ -0,0 +1,93 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html
+ PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+
+<head>
+ <title>FOray: Notes, Unreleased Changes</title>
+ <meta name="content-revised"
+ content="$Date$"/>
+ <!--#include virtual="/00-rsrc/include/standard-head.html" -->
+</head>
+
+<body>
+<!--#include virtual="/00-rsrc/include/leftmenu.html" -->
+
+<h1>FOray: Notes, Unreleased Changes</h1>
+
+<p>Unreleased changes can be obtained from the root of the source code
+repository.</p>
+
+<h3>Unreleased changes of interest to Users</h3>
+<ul>
+ <li>FOray is now dependent on Java 5.0 or higher.
+ Previous releases depended on Java 1.4 or higher.</li>
+ <li>FOray no longer requires JAI libraries to be installed for PNG and TIFF
+ support. These services are now provided natively through open-source
+ libraries packaged with FOray.</li>
+ <li>Tables now work much better (not perfectly).</li>
+ <li>Lists now work much better (not perfectly).</li>
+ <li>Added support for the axsl:metadata extension. See
+ <a href="/app/features/extensions.html#axsl-extensions">aXSL
+ Extensions</a>.</li>
+ <li>Added parsing and validation support for the objects and properties added
+ in XSL-FO 1.1.
+ These are not necessarily used by other FOray modules, but their existence
+ should not cause processing errors.
+ (The bookmark-related objects are fully supported).</li>
+ <li>Significant improvements to expressions in property values.</li>
+ <li>FOray now has limited support for embedding EPS (encapsulated PostScript)
+ files in PDF output. See <a href="../features/graphics.html#eps">EPS
+ Graphics</a> for more details.</li>
+ <li>Support has been discontinued for the "continued-label" object in the
+ "foray" namespace. This general capability has been replaced by new features
+ in XSL-FO 1.1, which will be implemented in FOray as user needs are expressed
+ and as developer resources are available.</li>
+</ul>
+
+<h3>Unreleased changes of interest to Developers</h3>
+<ul>
+ <li>Please review the list of
+ <a href="http://www.axsl.org/rel-notes.html">unreleased changes to aXSL</a>.
+ FOray maintains its API to conform to aXSL's, even between releases.</li>
+ <li>The class org.foray.common.StringUtilPre5 has been removed, and uses of
+ its methods have been replaced by standard Java 5.0 String and Character
+ methods.</li>
+ <li>Completion of javadoc API documentation for the FOrayGraphic,
+ FOrayHyphen-R, FOrayFOTree, FOrayOutput, FOrayLayout, FOrayText, FOrayMIF,
+ FOrayPDF, FOrayArea, FOrayRender, FOrayApp, and FOrayCore modules.
+ All FOray modules now have comprehensive javadocs.</li>
+ <li>The FOrayGraphicServer constructor no longer requires the name of the
+ parser class as a parameter.</li>
+ <li>JUnit tests have been introduced in most modules. Our testing is far
+ from comprehensive, but some good infrastructure is now in place which is
+ already assisting in maintaining stability.</li>
+ <li>Hyphenation pattern files are now named using the 3-character ISO-639
+ language codes instead of the 2-character codes previously used. This change
+ is transparent to the user (they can still use either the 2- or 3-character
+ code in input). This change was made to accommodate languages for which no
+ 2-character code was assigned.</li>
+ <li>FOTree properties have been overhauled a bit.
+ Each property now has its own class, with abstract superclasses where
+ appropriate.
+ This makes storage of the property type redundant, and it has been removed,
+ which should reduce the memory footprint.
+ Also the PropertyList no longer keeps a reference to the parent FObj.</li>
+ <li>The Apache Commons Logging dependency has been upgraded to version
+ 1.1.</li>
+ <li>The Batik libraries (for SVG processing) have been upgraded to 1.6.</li>
+ <li>Batik dependencies have been eliminated from the Common, FOTree, AreaTree,
+ Pioneer Layout, PDF and Renderer modules. All Batik dependencies are now
+ confined to the Graphic package, and are thus effectively hidden behind the
+ aXSL Graphic interfaces. This makes it much more feasible for users to drop in
+ another (perhaps commercial) SVG package. It also makes maintenance of the
+ Batik integration much more straightforward.</li>
+ <li>The Apache XML Graphics Commons dependency has been upgraded to
+ version 1.2.</li>
+</ul>
+
+<!--#include virtual="/00-rsrc/include/leftmenu-end.html" -->
+</body>
+</html>
Property changes on: trunk/foray/doc/web/00-release/notes-unreleased.html
___________________________________________________________________
Name: svn:keywords
+ "Author Id Rev Date URL"
Name: svn:eol-style
+ native
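The unreleased notes mention that hyphenation pattern files are now named with 3-character ISO-639 codes while users may still supply either the 2- or 3-character code. A minimal sketch of such a normalization, using the standard `java.util.Locale` API, is shown below. The class `HyphenationCodes` and its method are hypothetical, and this is not necessarily how FOray performs the mapping.

```java
import java.util.Locale;

/** Hypothetical helper; FOray's own lookup logic may differ. */
final class HyphenationCodes {

    private HyphenationCodes() { }

    /**
     * Returns a 3-letter ISO 639 code for a 2- or 3-letter input.
     * Three-letter input is only lower-cased, not validated.
     */
    static String toThreeLetterCode(final String language) {
        if (language.length() == 3) {
            return language.toLowerCase(Locale.ROOT);
        }
        // getISO3Language() throws MissingResourceException when the JDK
        // has no 3-letter code for the given language.
        return Locale.forLanguageTag(language).getISO3Language();
    }
}
```

For example, `HyphenationCodes.toThreeLetterCode("en")` yields `"eng"`, so either form of input could resolve to the same pattern file name.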
Modified: trunk/foray/doc/web/app/using/release.html
===================================================================
--- trunk/foray/doc/web/app/using/release.html 2007-09-29 22:51:39 UTC (rev 10217)
+++ trunk/foray/doc/web/app/using/release.html 2007-09-29 23:32:09 UTC (rev 10218)
@@ -6,7 +6,7 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
- <title>FOray: Release History</title>
+ <title>FOray: Release Notes</title>
<meta name="content-revised"
content="$Date$"/>
<!--#include virtual="/00-rsrc/include/standard-head.html" -->
@@ -15,507 +15,20 @@
<body>
<!--#include virtual="/00-rsrc/include/leftmenu.html" -->
-<h1>FOray: Release History</h1>
-<h2>Contents</h2>
-<ul>
- <li><a href="#intro">Introduction</a></li>
- <li><a href="#unreleased">Unreleased Changes</a></li>
- <li><a href="#0_2">Release 0.2</a></li>
- <li><a href="#0_1">Release 0.1</a></li>
-</ul>
+<h1>FOray: Release Notes</h1>
-<h2><a name="intro">Introduction</a></h2>
-<p>All FOray modules and the reference application have concurrent version
-numbers. So, if you are using version x.y of module A, the correct version of
-module B to use with it is also x.y.</p>
+<p>All FOray modules have concurrent version numbers.
+So, if you are using version x.y of module A, the correct version of module B to
+use with it is also x.y.</p>
-<h2><a name="unreleased">Unreleased Changes</a></h2>
-<p>Unreleased changes can be obtained from the root of the source code
-repository.</p>
-
-<h3>Unreleased changes of interest to Users</h3>
<ul>
- <li>FOray is now dependent on Java 5.0 or higher.
- Previous releases were depending on Java 1.4 or higher.</li>
- <li>FOray no longer requires JAI libraries to be installed for PNG and TIFF
- support. These services are now provided natively through open-source
- libraries packaged with FOray.</li>
- <li>Tables now work much better (not perfectly).</li>
- <li>Lists now work much better (not perfectly).</li>
- <li>Added support for the axsl:metadata extension. See
- <a href="/app/features/extensions.html#axsl-extensions">aXSL
- Extensions</a>.</li>
- <li>Added parsing and validation support for the objects and properties added
- in XSL-FO 1.1.
- These are not necessarily used by other FOray modules, but their existence
- should not cause processing errors.
- (The bookmark-related objects are fully supported).</li>
- <li>Significant improvements to expressions in property values.</li>
- <li>FOray now has limited support for embedding EPS (encapsulated PostScript)
- files in PDF output. See <a href="../features/graphics.html#eps">EPS
- Graphics</a> for more details.</li>
- <li>Support has been discontinued for the "continued-label" object in the
- "foray" namespace. This general capability has been replaced by new features
- in XSL-FO 1.1, which will be implemented in FOray as user needs are expressed
- and as developer resources are available.</li>
+ <li><a href="/00-release/notes-unreleased.html">Unreleased Changes</a></li>
+ <li><a href="/00-release/notes-0_002.html">Release Notes for FOray 0.2</a>,
+September 30, 2006.</li>
+ <li><a href="/00-release/notes-0_001.html">Release Notes for FOray 0.1</a>,
+July 29, 2004.</li>
</ul>
-<h3>Unreleased changes of interest to Developers</h3>
-<ul>
- <li>Please review the list of
- <a href="http://www.axsl.org/rel-notes.html">unreleased changes to aXSL</a>.
- FOray maintains its API to conform to aXSL's, even between releases.</li>
- <li>The class org.foray.common.StringUtilPre5 has been removed, and uses of
- its methods have been replaced by standard Java 5.0 String and Character
- methods.</li>
- <li>Completion of javadoc API documentation for the FOrayGraphic,
- FOrayHyphen-R, FOrayFOTree, FOrayOutput, FOrayLayout, FOrayText, FOrayMIF,
- FOrayPDF, FOrayArea, FOrayRender, FOrayApp, and FOrayCore modules.
- All FOray modules now have comprehensive javadocs.</li>
- <li>The FOrayGraphicServer constructor no longer requires the name of the
- parser class as a parameter.</li>
- <li>JUnit tests have been introduced in most modules. Our testing is far
- from comprehensive, but some good infrastructure is now in place which is
- already assisting in maintaining stability.</li>
- <li>Hyphenation pattern files are now named using the 3-character ISO-639
- language codes instead of the 2-character codes previously used. This change
- is transparent to the user (they can still use either the 2- or 3-character
- code in input). This change was made to accommodate languages for which no
- 2-character code was assigned.</li>
- <li>FOTree properties have been overhauled a bit.
- Each property now has its own class, with abstract superclasses where
- appropriate.
- This makes storage of the property type redundant, and it has been removed,
- which should reduce the memory footprint.
- Also the PropertyList no longer keeps a reference to the parent FObj.</li>
- <li>The Apache Commons Logging dependency has been upgraded to version
- 1.1.</li>
- <li>The Batik libraries (for SVG processing) have been upgraded to 1.6.</li>
- <li>Batik dependencies have been eliminated from the Common, FOTree, AreaTree,
- Pioneer Layout, PDF and Renderer modules. All Batik dependencies are now
- confined to the Graphic package, and are thus effectively hidden behind the
- aXSL Graphic interfaces. This makes it much more feasible for users to drop in
- another (perhaps commercial) SVG package. It also makes maintenance of the
- Batik integration much more straightforward.</li>
- <li>The Apache XML Graphics Commons dependency has been upgraded to
- version 1.2.</li>
-</ul>
-
-<h2><a name="0_2">Release 0.2</a></h2>
-<h3>Release 0.2 changes of interest to Users</h3>
-<p class="warning">FOray 0.2 is primarily for developers and module users.
-Although many parts of FOray work well, and are usable as modules, the
-application as a whole has some general problems that require attention.</p>
-
-<p>Here are the more significant of the known issues:</p>
-<ul>
- <li>The FO Tree does not parse all expressions properly. Valid FO syntax
- is sometimes flagged as an error.</li>
- <li>The Area Tree does not handle tables and lists properly in most cases.
- Specifically, these items will likely not be placed in the proper place,
- and may not even be placed in a reasonable place.</li>
-</ul>
-
-<h4>General Configuration Changes</h4>
-<ul>
- <li>The names of several configuration file entries have been changed for
-the sake of consistency and clarity:
- <table>
- <thead>
- <tr>
- <th>Old Name</th>
- <th>New Name</th>
- </tr>
- </thead>
- <tr><td>baseDir</td><td>base-directory</td></tr>
- <tr><td>debugMode</td><td>debug-mode</td></tr>
- <tr><td>dumpConfiguration</td><td>dump-configuration</td></tr>
- <tr><td>fontBaseDir</td><td>font-base-directory</td></tr>
- <tr><td>hyphenation-dir</td><td>hyphenation-base-directory</td></tr>
- <tr><td>stream-filter-list</td><td>pdf-filters</td></tr>
- <tr><td>strokeSVGText</td><td>stroke-svg-text</td></tr>
- </table>
- </li>
- <li>A new command-line option "-so" (session option) has been added to allow
- the entry of any configuration option through the command-line. See
- <a href="configuration.html">FOray Configuration</a> for more details.</li>
- <li>The names of several renderer options (set through command-line options
- or through the API) have been changed for the sake of consistency and
- clarity. In addition, some general command-line options have been replaced
- by general configuration options. Even though the old command-line options
- have been removed, their
- functional equivalent can be attained by using the new "-so" command-line
- option together with the new configuration option.
- <table>
- <thead>
- <tr>
- <th>Old Command-line Option</th>
- <th>Old Renderer Option Name</th>
- <th>New Configuration Key</th>
- </tr>
- </thead>
- <tr>
- <td>-o</td>
- <td>ownerPassword</td>
- <td>pdf-owner-password</td>
- </tr>
- <tr>
- <td>-u</td>
- <td>userPassword</td>
- <td>pdf-user-password</td>
- </tr>
- <tr>
- <td>-noprint</td>
- <td>allowPrint</td>
- <td>pdf-user-print</td>
- </tr>
- <tr>
- <td>-nocopy</td>
- <td>allowCopyContent</td>
- <td>pdf-user-copy</td>
- </tr>
- <tr>
- <td>-noedit</td>
- <td>allowEditContent</td>
- <td>pdf-user-modify</td>
- </tr>
- <tr>
- <td>-noannotations</td>
- <td>allowEditAnnotations</td>
- <td>pdf-user-annotate</td>
- </tr>
- <tr>
- <td>-txt.encoding</td>
- <td>unknown</td>
- <td>txt-encoding</td>
- </tr>
- <tr>
- <td>-s</td>
- <td>unknown</td>
- <td>at-sparse</td>
- </tr>
- <tr>
- <td>-q</td>
- <td>n/a</td>
- <td>verbosity</td>
- </tr>
- <tr>
- <td>-d</td>
- <td>n/a</td>
- <td>verbosity</td>
- </tr>
- <tr>
- <td>-x</td>
- <td>n/a</td>
- <td>dump-configuration (already existed)</td>
- </tr>
- <tr>
- <td>-l</td>
- <td>n/a</td>
- <td>language</td>
- </tr>
- </table>
- </li>
- <li>Configuration options whose value is a list are now entered as
- space-delimited Strings. Previously these were placed in the configuration
- file as value elements inside of a list element. This change was made to
- facilitate making these options available on the command line.</li>
- <li>A boolean configuration option <strong>cache-graphics</strong> was
-added.
-It defaults to "false".
-If set to true, graphic objects will be cached and can be reused, resulting in
-potential speed improvement.
-If set to false, graphic objects are not reused, resulting in potential memory
-savings.</li>
-</ul>
-
-<h4>Font Changes</h4>
-<ul>
- <li>The intermediate font-metrics xml file is no longer used.
-Instead, font resources are read on-the-fly as needed.
-This has resulted in numerous changes to the font configuration file (see
-below).</li>
- <li>Fonts used by the system are no longer read entirely into memory, but are
-read from disk as necessary. This, along with some other improvements, should
-reduce the amount of memory needed for fonts.</li>
- <li>The <code>embed="all"</code> attribute for the <font>
-element in the font configuration
-file is now respected for CID fonts.</li>
- <li>PDF output for CID fonts now includes the ToUnicode CMap. This allows
-cut-and-paste, index, search, etc. to work correctly on PDF text for these
-fonts. This fixes FOP's Bugzilla entry #5335.</li>
- <li>Italic angle information is now parsed from PFM files.</li>
- <li>FontBBox and StemV (StdVW) information is now parsed from PFA and PFB
-files.</li>
- <li>Adobe Font Metrics (AFM) files may now be used for font metrics.
-To use this option, set the font configuration item
-<strong>metrics-file</strong> to point
-to an AFM file. AFM files are now the preferred way to communicate font metrics.
-Note that users can easily modify AFM files to correct bad font information.
-Also AFM files can theoretically be used for bitmapped fonts.</li>
- <li>Kerning has been fixed for subsetted fonts. Kerning now works correctly
-for all supported formats.</li>
- <li>The command-line class PFMReader has been removed.</li>
- <li>The command-line class TTFReader has been changed. It now takes
-either one or two arguments. If given one, and that one is a TrueType collection
-file, the list of fonts within that collection will be written. If given a
-TrueType collection file followed by the name of a font within that file, some
-basic metrics will be parsed and written. If given one argument, and that
-argument is a regular TrueType file, the basic metrics of that file will be
-parsed and written. This tool is useful primarily to see the list of fonts in
-a TTC file, and as a debugging tool if metrics do not seem to be computed
-properly.</li>
- <li>It is now possible to actively prevent the embedding of font files
-whose permissions restrict embedding, and FOray does prevent such
-embedding.</li>
- <li>The <strong>font-base-directory</strong> configuration option is now
-used as a base directory
-for <strong>font-file</strong> and <strong>metrics-file</strong>. It was
-formerly used as a base directory for the
-intermediate xml font metrics files. </li>
-</ul>
-
-<h4>Font Configuration Changes</h4>
-<p>The configuration of fonts has changed dramatically. See
-<a href="../features/fonts.html#config">Font Configuration</a> for details on
-the changes listed.</p>
-
-<ul>
- <li>The <strong>server</strong> element has been added.</li>
- <li>The <strong>parameter</strong> element has been added.</li>
- <li>The <strong>glyph-list</strong> element has been added.</li>
- <li>The <strong>encoding</strong> element has been added.</li>
- <li>The <strong>font-family</strong> element has been added.</li>
- <li>The <strong>font-family-alias</strong> element has been added.</li>
- <li>The <strong>font-triplet</strong> element has been renamed to
-<strong>font-description</strong>.
-The font-description element must now be inside a font-family element.</li>
- <li>The <strong>name</strong> attribute of the element
-<strong>font-description</strong> (formerly <strong>font-triplet</strong>) has
-been removed.
-Instead the font-description element is placed inside a font-family element,
-which contains the name.</li>
-</ul>
-
-<p>The following have been added as attributes to the <strong>font</strong>
-element.
-These were formerly command-line options in TTFReader and PFMReader (the
-applications that built the font metrics files):</p>
-
-<ul>
- <li><strong>font-resource</strong> (was the "-er" [embed-resource]
-command-line option).
-Font files can be included in fop.jar when building fop. Both font-resource and
-font-file can be specified for the same font, but if font-file is specified and
-can be opened, it will take precedence over font-resource. Please note that the
-companion option -ef already exists in the font configuration file as
-<strong>embed-file</strong> (now called <strong>font-file</strong>).</li>
- <li><strong>embed-name</strong> (was the "-fn" command-line option). If
-embed-name is
-specified, it will be used as the name of the font in the output document, if
-the font is embedded in that document. This is sometimes useful when trying to
-ensure that the embedded font is used in the document instead of a system font
-with the same name. (Note that if <strong>embed-name</strong> is
-<i>not </i>specified, FOray first
-attempts to extract and embed the name from the font itself. If that fails, it
-uses the configured attribute <strong>name</strong>.)</li>
- <li><strong>ttc-name</strong> (was the "-ttcname" command-line option).
-If you are reading data from a TrueType Collection (.ttc
-file) you must specify which font from the collection you will read metrics
-from.</li>
- <li><strong>metrics-file</strong>. This was formerly the pfm-file that
-was designated
-positionally on the command-line. It must now be explicitly identified in the
-configuration file for Type 1 fonts. Please note that this <i>is not </i> the
-same meaning that metrics-file previously had. Before FOray 0.2, the
-metrics-file attribute pointed to the (no-longer-used) intermediate xml file
-that was parsed from font metrics information.</li>
- <li><strong>embed</strong>. This was formerly implied by the use of
-<strong>embed-file</strong> and
-<strong>embed-resource</strong>. Since those attributes have been renamed to
-clarify their use,
-you must now explicitly state what embedding behavior is desired. Valid options
-are "all", "subset", and "none". The default is "none".</li>
-</ul>
-
-<p>In addition, the following changes related to the direct parsing of font files
-have been made:</p>
- <ul>
- <li>The <strong>name</strong> attribute has been added for the element
-<strong>font</strong> in the font
-configuration file. This is currently not used at all, except as a fallback
-name to be embedded in an output file if no other name can be found. Its main
-purpose is for reference in writing and parsing the configuration file, but it
-may be used as a "key" for other configuration elements in the future.</li>
- <li>The former attribute <strong>embed-file</strong> for the element
-<strong>font</strong> in the font
-configuration file, has been renamed to <strong>font-file</strong>.
-This change was made for clarity since, for some font types, the font file
-contains both the embedding information and the metrics information.</li>
- <li>Note that the former TTFReader command-line option "-enc" has no
-analog in the configuration file. The artificial reencoding that this option
-enabled is no longer necessary now that CID fonts are properly embedded in the
-PDF output (see the change above regarding the ToUnicode CMap).</li>
- </ul>
-
-<h4>Other Changes</h4>
-<ul>
- <li>Since we have now obtained rights to the internet domain "foray.org", the
-package structure of all FOray module classes has been moved from "com.outfitr"
-to "org.foray". The package structure for fop-maint remains unchanged.</li>
- <li>Starting with Release 0.2, FOray requires a minimum java runtime
-environment of 1.4.</li>
- <li><em>All</em> properties should now be supported at the parse level,
-including those properties (like aural properties) that are not directly used
-by any FOray application. This also includes all shorthand properties. Please
-note that this does not mean that all properties are used or used properly by
-the layout or rendering systems.</li>
- <li>The class for running FOray from the command-line has been renamed
-"FOray".</li>
- <li>The outer-layer application API has changed significantly. This will
-not affect command-line users, but will affect anyone running FOray embedded
-within another application. Details of the new API can be found at <a
-href="../../module/app/index.html">The FOray Application</a>, and at the run()
-method in the class CommandLineStarter.</li>
- <li>Support for the Jimi package has been removed from FOrayGraphic. We
-will consider adding it back if there is sufficient user demand. It was used
-only for supporting PNG, which is adequately supported by JAI.</li>
- <li>Extensions which used the fox: namespace in FOP 0.20.5 are supported
- in FOray under the foray: namespace with the following exceptions:
- 1) Support for the extension objects foray:outline and foray:label has
- been dropped in favor of the scheme being considered for
- <a href="http://www.w3.org/TR/xsl11/#d0e14206">XSL-FO 1.1</a>, which
- uses bookmark-tree, bookmark, and bookmark-title objects instead.
- 2) The extension object foray:destination has been dropped. Named
- destinations are now automatically created for objects with the "id"
- property.</li>
- <li>Support for the text-transform property was added.</li>
- <li>A configuration option "pdf-version" was added. It defaults to "1.6", the
- latest version of PDF. See
- <a href="./configuration.html#pdf-version">pdf-version configuration</a>
- for more information.</li>
-</ul>
-
-<h3>Release 0.2 changes of interest to Developers</h3>
-
-<h4>Modularity Changes</h4>
-<p>As of Release 0.2, all FOray modules are independent of each other, using
-the aXSL interfaces to achieve pluggability and interoperability.
-With the exception of dependencies on the utility-type modules (Common, Pretty,
-and PS), FOray modules now have no dependencies on other FOray modules.</p>
-
-<ul>
- <li>The following methods have been added to the FontConsumer interface to
-allow the client application to control what combinations of FreeStandingFonts
-and SystemFonts should be used: boolean isUsingFreeStandingFonts(), and
-boolean isUsingSystemFonts().</li>
- <li>The classes FontInfo and FontTriplet, which represented temporary data
-structures used during the parsing of the font configuration file, have been
-eliminated. Fonts and font descriptions are now registered on-the-fly as the
-font configuration file is parsed.</li>
- <li>A method FontServer.optimizeFonts(FontConsumer consumer) has been added.
-This method optimizes fonts in preparation for embedding. Currently, the only
-optimization implemented is that subset font glyph indices are sorted by their
-Unicode code point, which potentially saves space in PDF files.
-<i>Caveat: </i>The timing of when this process is run is very critical.
-It should only be run <i>after </i>all glyph indices that are used by
-the FontConsumer have been registered.
-It should also only be run <i>before </i>any glyph indices have been
-written to actual document output. If the client runs the registration
-and writing tasks concurrently, it should not use this method, as doing
-so will corrupt the logical connection between the glyph indices used to
-embed the font and those used to write the document contents. Please note that
-FOray's reference FO processing implementation is currently unable to use this
-method because of such concurrent processing.</li>
- <li>To assist in parsing Type1 fonts, FOray now has the modest beginnings of
-a PostScript interpreter emulation. It is included as part of the new "FOrayPS"
-module. We hope to use this for future projects such as embedding EPS files in
-PDF output.</li>
- <li>The filter and encoding classes that were part of the pdf module have
-been moved to the new ps module. The theory here is that PDF is kind of a
-superset of PostScript, and that items common to both should live with
-PostScript.</li>
- <li>FontServer can (and in fact must) now be instantiated, and its variables
-and methods are no longer static (except for a few utility methods). It is still
-intended to be used as a singleton, but there is no requirement that it be so,
-and, with reasonable care, multiple instances should be manageable. A
-getFontServer() method has been added to the FontConsumer interface, as the
-FontConsumer must now know which FontServer it is using.</li>
- <li>Each of the Font, Graphic, and Text subsystems now has a Server class
-associated with it (e.g. GraphicServer). Instances of these servers may be
-passed to FOraySession's constructor. If not, default instances will be created
-by FOraySession. This means that you can enhance your local system by extending
-these Server classes. Also, because the Servers live outside of the FOray
-application, they can conceivably persist for multiple Sessions.</li>
- <li>GraphicServer has a new method registerFactory, which accepts one
-GraphicFactory as a parameter. Once registered, a custom factory will be
-consulted as each Graphic is constructed. A GraphicFactory only reads through
-enough of the file contents to determine whether the file is of the type that
-it knows how to create. If so, it can create the Graphic instance. Custom
-factories are consulted before standard factories, allowing developers to
-extend FOrayGraphic capabilities along a new dimension.</li>
- <li>A new Namespace class has been added in FOTree to manage the
-various issues associated with managing multiple namespaces within the FOTree
-itself. Each Namespace instance is responsible to convert elements and
-attributes in its namespace into FObj and Property instances that can be
-placed in the FOTree. Custom Namespace instances can be registered to the
-FOraySession instance through the registerNamespace() method, which accepts
-one Namespace instance as a parameter.</li>
-</ul>
-
-<h2><a name="0_1">Release 0.1</a></h2>
-<p>Release 0.1 can be obtained from the source
-code repository by using the tag <code>rel_0_1_branch</code>.</p>
-<p>Release 0.1 is totally oriented toward modularization of the system, and
-toward improvements to the font system.</p>
-
-<h3>Changes from FOP 0.20.5 to FOray 0.1 of interest to Users</h3>
-<p>Current work is deliberately focused on architecture, not features. However,
-the following improvements have resulted as side-effects of the architectural
-changes:</p>
-<ul>
- <li>Embedded CID fonts had a prefix added to their names in the PDF output
-file. This has been removed.</li>
- <li>Font subsetting should now be thread-safe. Each document has its own
-list of characters used for font embedding/subsetting purposes. This needs to
-be tested in a multi-threading environment. (Also, please note that, because
-of limitations in the handling of System fonts which have yet to be resolved,
-some custom coding will probably be required to keep the FontServer running
-across multiple documents.)</li>
-</ul>
-<p>In addition, the following changes should be noted:</p>
-<ul>
- <li>All font configuration entries, which used to reside in the FOP
-configuration file, must now be located in a separate file. There is a new
-FOray configuration entry <strong>font-configuration</strong> which should be used
-to point to this
-new file.</li>
- <li>The font configuration file should have a root element of
-<foray-font-config>. The <font> elements are children of the root.
-No other changes have been made to the way fonts are configured.</li>
-</ul>
-
-<h3>Changes from FOP 0.20.5 to FOray 0.1 of interest to Developers</h3>
-<ul>
- <li>Font-related classes have been extracted into a separate module.</li>
- <li>Graphic and PDF-related classes have also been extracted into separate
-modules as a side-effect of getting the Font-related classes extracted.</li>
- <li>Subclass relationships between Font-related classes have been cleaned up
-extensively.</li>
- <li>The relationship between Font concepts and PDF implementation
-concepts has been clarified greatly (probably needs more work).</li>
- <li>Visibility on most Font-related classes has been reduced.</li>
- <li>The following methods in the forayFont class called Font (called
-fonts/FontMetrics in FOP's HEAD) were returning values in millionths of a point:
-getAscender(), getDescender(), getCapHeight(), getXHeight(), and getCharWidth().
-Methods using values returned by these methods were typically dividing the
-returned values by 1000. The Font methods have been changed to return values in
-millipoints (1/1000 of a point), and the corresponding methods in fop-maint have
-been changed to <i>not </i>divide by 1000. Although this makes sense, it may be
-confusing to anyone trying to implement the forayFont module into FOP's
-HEAD.</li>
-</ul>
-
<!--#include virtual="/00-rsrc/include/leftmenu-end.html" -->
</body>
</html>
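For readers migrating a pre-0.2 configuration file, the key renames tabulated in the Release 0.2 notes above can be sketched as a simple lookup. The class name ConfigKeyMigration and its migrate() method are hypothetical and are not part of FOray; only the old and new key names come from the notes. The sketch is written for a current JDK rather than the Java 1.4 baseline the notes mention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of FOray): maps pre-0.2 configuration keys
 * to the names listed in the Release 0.2 notes.
 */
final class ConfigKeyMigration {

    /** Old key name -> new key name, in the order given in the notes. */
    private static final Map<String, String> RENAMED = new LinkedHashMap<>();

    static {
        RENAMED.put("baseDir", "base-directory");
        RENAMED.put("debugMode", "debug-mode");
        RENAMED.put("dumpConfiguration", "dump-configuration");
        RENAMED.put("fontBaseDir", "font-base-directory");
        RENAMED.put("hyphenation-dir", "hyphenation-base-directory");
        RENAMED.put("stream-filter-list", "pdf-filters");
        RENAMED.put("strokeSVGText", "stroke-svg-text");
    }

    private ConfigKeyMigration() { }

    /** Returns the 0.2 name for a key, or the key unchanged if it was not renamed. */
    static String migrate(final String key) {
        return RENAMED.getOrDefault(key, key);
    }

    public static void main(final String[] args) {
        // Print the full old -> new mapping.
        for (final Map.Entry<String, String> entry : RENAMED.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Keys that were not renamed pass through unchanged, so the lookup could be applied to every entry of an old configuration file without special cases.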
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
|
|
From: <vic...@us...> - 2007-09-29 22:51:37
|
Revision: 10217
http://foray.svn.sourceforge.net/foray/?rev=10217&view=rev
Author: victormote
Date: 2007-09-29 15:51:39 -0700 (Sat, 29 Sep 2007)
Log Message:
-----------
Exclude unnecessary dev files from the build distribution.
Modified Paths:
--------------
trunk/foray/scripts/build.xml
Modified: trunk/foray/scripts/build.xml
===================================================================
--- trunk/foray/scripts/build.xml 2007-09-29 22:05:55 UTC (rev 10216)
+++ trunk/foray/scripts/build.xml 2007-09-29 22:51:39 UTC (rev 10217)
@@ -259,15 +259,18 @@
-->
<copy todir="${bin.staging.lib.dir}">
<fileset dir="${build.dir}" includes="foray*.jar"/>
- <fileset dir="${lib.dir}" includes="**"/>
+ <fileset dir="${lib.dir}" includes="**" excludes=".project"/>
</copy>
<!-- Copy the ancillary distribution files into the staging area. -->
<copy todir="${bin.staging.dir}/${binary.dist.name}">
<fileset dir="${foray.sandbox}" includes="config/**"/>
- <fileset dir="${foray.sandbox}" includes="doc/**"/>
- <fileset dir="${foray.sandbox}" includes="resource/**"/>
- <fileset dir="${foray.sandbox}" includes="scripts/**"/>
+ <fileset dir="${foray.sandbox}" includes="doc/**">
+ <exclude name="doc/.settings/"/>
+ <exclude name="doc/.project"/>
+ </fileset>
+ <fileset dir="${foray.sandbox}" includes="resource/**" excludes="resource/.project"/>
+ <fileset dir="${foray.sandbox}" includes="scripts/**" excludes="scripts/.project"/>
<fileset dir="${foray.sandbox}" includes="readme.txt"/>
</copy>
@@ -303,15 +306,18 @@
area. -->
<copy todir="${module.staging.lib.dir}">
<path refid="distribution.jars"/>
- <fileset dir="${lib.dir}" includes="**"/>
+ <fileset dir="${lib.dir}" includes="**" excludes=".project"/>
</copy>
<!-- Copy the ancillary distribution files into the staging area. -->
<copy todir="${module.staging.dir}/${module.dist.name}">
<fileset dir="${foray.sandbox}" includes="config/**"/>
- <fileset dir="${foray.sandbox}" includes="doc/**"/>
- <fileset dir="${foray.sandbox}" includes="resource/**"/>
- <fileset dir="${foray.sandbox}" includes="scripts/**"/>
+ <fileset dir="${foray.sandbox}" includes="doc/**">
+ <exclude name="doc/.settings/"/>
+ <exclude name="doc/.project"/>
+ </fileset>
+ <fileset dir="${foray.sandbox}" includes="resource/**" excludes="resource/.project"/>
+ <fileset dir="${foray.sandbox}" includes="scripts/**" excludes="scripts/.project"/>
<fileset dir="${foray.sandbox}" includes="readme.txt"/>
</copy>
|
|
From: <vic...@us...> - 2007-09-29 22:05:52
|
Revision: 10216
http://foray.svn.sourceforge.net/foray/?rev=10216&view=rev
Author: victormote
Date: 2007-09-29 15:05:55 -0700 (Sat, 29 Sep 2007)
Log Message:
-----------
Remove no-longer-needed directory.
Removed Paths:
-------------
trunk/foray/config/schema/
|
|
From: <vic...@us...> - 2007-09-28 19:46:44
|
Revision: 10215
http://foray.svn.sourceforge.net/foray/?rev=10215&view=rev
Author: victormote
Date: 2007-09-28 12:46:47 -0700 (Fri, 28 Sep 2007)
Log Message:
-----------
Add unimplemented test for centered text with word-spacing.
Modified Paths:
--------------
trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java
Added Paths:
-----------
trunk/foray/resource/test/fo/block-004.fo
Modified: trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java
===================================================================
--- trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java 2007-09-25 17:46:16 UTC (rev 10214)
+++ trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java 2007-09-28 19:46:47 UTC (rev 10215)
@@ -218,6 +218,7 @@
/**
* Test of fo/block-003.fo.
+ * This is a test of simple centering of a text-area on a line.
* @throws FOrayException For errors creating the FO Tree or Area Tree.
*/
public void testBlock003() throws FOrayException {
@@ -256,11 +257,17 @@
assertTrue(node instanceof TextArea);
final TextArea textArea = (TextArea) node;
- /* The text is "Test of Centering". From the Helvetica AFM file, the
- * widths are as follows: T(611) + e(556) + s(500) + t(278) + space(278)
- * + o(556) + f(278) + space(278) + C(722) + e(556) + n(556) + t(278)
- * + e(556) + r(333) + i(222) + n(556) + g(556) = 7,670. If these are
- * scaled to 12 points, the millipoints used are 7670 * 12 = 92,040. */
+ /* The text is "Test of Centering".
+ * From the Helvetica AFM file, the widths are as follows:
+ * T(611) + e(556) + s(500) + t(278)
+ * + space(278)
+ * + o(556) + f(278)
+ * + space(278)
+ * + C(722) + e(556) + n(556) + t(278) + e(556) + r(333) + i(222)
+ * + n(556) + g(556)
+ * = 7,670.
+ * If these are scaled to 12 points, the millipoints used are 7670 * 12
+ * = 92,040. */
assertEquals(92040, textArea.crIpd());
/* The x value of the text area content rectangle should be at the x
@@ -275,4 +282,76 @@
assertEquals(718800, textArea.crOriginY());
}
+ /**
+ * Test of fo/block-004.fo.
+ * This is a test of centering of a text-area on a line, where the text-area
+ * has word-spacing = ".3em".
+ * @throws FOrayException For errors creating the FO Tree or Area Tree.
+ */
+ public void testBlock004() throws FOrayException {
+ final AreaTreeCreator creator = AreaTreeCreator.getInstance();
+ final AreaTree areaTree = creator.buildAreaTree(
+ "fo/block-004.fo");
+ final NormalFlowRA firstNormalFlowArea = this.getFirstNormalFlowArea(
+ areaTree);
+
+ /* Test location and dimensions of the block area. */
+ AreaNode node = firstNormalFlowArea.getChildAt(0);
+ assertTrue(node instanceof NormalBlockArea);
+ final NormalBlockArea blockArea = (NormalBlockArea) node;
+ /* 1 inch left margin. */
+ assertEquals(72000, blockArea.crOriginX());
+ /* 10 inches from bottom (11 inches high, 1 inch top margin). */
+ assertEquals(720000, blockArea.crOriginY());
+ /* Page is 8.5 inches wide, with 2 inches total margin.
+ * 6.5 * 72,000 = 468,000. */
+ assertEquals(468000, blockArea.crIpd());
+
+ /* Test location and dimensions of the line area. */
+ node = blockArea.getChildAt(0);
+ assertTrue(node instanceof LineArea);
+ final LineArea lineArea = (LineArea) node;
+ /* x same as the parent block. */
+ assertEquals(72000, lineArea.crOriginX());
+ /* y adjusted for half-leading = 12,000 * .2 * .5 = 1200. */
+ assertEquals(718800, lineArea.crOriginY());
+ /* ipd same as parent block. */
+ assertEquals(468000, lineArea.crIpd());
+
+ /* Test location and dimensions of the text area. The key thing we are
+ * testing here is that IT IS CENTERED. */
+ node = lineArea.getChildAt(0);
+ assertTrue(node instanceof TextArea);
+ final TextArea textArea = (TextArea) node;
+
+ /* The text is "Centered with Word Spacing".
+ * From the Helvetica AFM file, the widths are as follows:
+ * C(722) + e(556) + n(556) + t(278) + e(556) + r(333) + e(556) + d(556)
+ * + space(278)
+ * + w(722) + i(222) + t(278) + h(556)
+ * + space(278)
+ * + W(944) + o(556) + r(333) + d(556)
+ * + space(278)
+ * + S(556) + p(556) + a(556) + c(500) + i(222) + n(556) + g(556)
+ * = 12,616.
+ * If these are scaled to 12 points, the millipoints used are 12,616
+ * * 12 = 151,392.
+ * The extra word spacing is .3em = .3 * 12000 * 3 occurrences = 10,800.
+ * Total ipd of text-area = 151,392 + 10,800 = 162,192. */
+
+ /* TODO: Turn these tests back on after the code has been fixed. */
+// assertEquals(162192, textArea.crIpd());
+
+ /* The x value of the text area content rectangle should be at the x
+ * location of the parent line area + 1/2 of the unused area in the
+ * line. Total line area ipd = 468,000. Unused line area ipd =
+ * 468,000 - 92,040 = 375,960. One half of the unused line area ipd =
+ * 187,980. x = 72,000 + 187,980 = 259,980. */
+// assertEquals(259980, textArea.crOriginX());
+
+ /* The y value of the text area content rectangle should be the same as
+ * the parent line area. */
+ assertEquals(718800, textArea.crOriginY());
+ }
+
}
Added: trunk/foray/resource/test/fo/block-004.fo
===================================================================
--- trunk/foray/resource/test/fo/block-004.fo (rev 0)
+++ trunk/foray/resource/test/fo/block-004.fo 2007-09-28 19:46:47 UTC (rev 10215)
@@ -0,0 +1,31 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<!--
+This fo tests centering of a text-area on a line, where the text-area has word
+spacing.
+-->
+
+<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
+
+<fo:layout-master-set>
+ <fo:simple-page-master
+ master-name="simple"
+ page-height="11in"
+ page-width="8.5in"
+ margin-top="1in"
+ margin-bottom="1in"
+ margin-left="1in"
+ margin-right="1in">
+ <fo:region-body/>
+ </fo:simple-page-master>
+</fo:layout-master-set>
+
+<fo:page-sequence master-reference="simple">
+<fo:flow flow-name="xsl-region-body">
+
+<fo:block font-family="sans-serif" font-size="12pt" text-align="center"
+word-spacing=".3em">Centered with Word Spacing</fo:block>
+
+</fo:flow>
+</fo:page-sequence>
+</fo:root>
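The width and centering arithmetic in the test comments of this commit can be checked with a short sketch. The class CenteringMath and its method names are hypothetical and not part of FOray; the glyph-width sums (7,670 and 12,616), the 12pt font size, the .3em word spacing and the 468,000-millipoint line width are taken from the comments in TestBlock.java. The sketch assumes, as the comment for block-004 does, that the extra word spacing counts toward the text-area's ipd.

```java
/** Hypothetical helper (not part of FOray) reproducing the test-comment arithmetic. */
final class CenteringMath {

    private CenteringMath() { }

    /** AFM widths are in 1/1000 em, so width * size-in-points gives millipoints. */
    static int textIpd(final int afmWidthSum, final int fontSizePt) {
        return afmWidthSum * fontSizePt;
    }

    /** Extra millipoints added by word-spacing, given in em, over a number of spaces. */
    static int wordSpacing(final double em, final int fontSizePt, final int spaces) {
        return (int) Math.round(em * fontSizePt * 1000 * spaces);
    }

    /** X origin of content centered on a line: line origin plus half the unused ipd. */
    static int centeredOriginX(final int lineOriginX, final int lineIpd,
            final int contentIpd) {
        return lineOriginX + (lineIpd - contentIpd) / 2;
    }

    public static void main(final String[] args) {
        // block-003: "Test of Centering"
        final int ipd003 = textIpd(7670, 12);
        System.out.println(ipd003);                                   // 92040
        System.out.println(centeredOriginX(72000, 468000, ipd003));   // 259980

        // block-004: "Centered with Word Spacing", word-spacing .3em, 3 spaces
        final int ipd004 = textIpd(12616, 12) + wordSpacing(0.3, 12, 3);
        System.out.println(ipd004);                                   // 162192
        System.out.println(centeredOriginX(72000, 468000, ipd004));   // 224904
    }
}
```

Under these numbers the centered x origin for block-004 comes out as 224,904. The commented-out assertion in testBlock004 still carries 259,980, and its comment still subtracts 92,040, both apparently left over from block-003, so those values would likely need updating when the TODO is resolved.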
|
|
From: <vic...@us...> - 2007-09-25 17:46:13
|
Revision: 10214
http://foray.svn.sourceforge.net/foray/?rev=10214&view=rev
Author: victormote
Date: 2007-09-25 10:46:16 -0700 (Tue, 25 Sep 2007)
Log Message:
-----------
Add test for line centering.
Modified Paths:
--------------
trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java
Added Paths:
-----------
trunk/foray/resource/test/fo/block-003.fo
Modified: trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java
===================================================================
--- trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java 2007-09-17 15:35:13 UTC (rev 10213)
+++ trunk/foray/foray-app/src/javatest/org/foray/app/area/TestBlock.java 2007-09-25 17:46:16 UTC (rev 10214)
@@ -34,6 +34,7 @@
import org.foray.area.NormalBlockArea;
import org.foray.area.NormalFlowRA;
import org.foray.area.PageCollection;
+import org.foray.area.TextArea;
import org.foray.core.FOrayException;
/**
@@ -215,4 +216,63 @@
assertEquals(testString, documentContent);
}
+ /**
+ * Test of fo/block-003.fo.
+ * @throws FOrayException For errors creating the FO Tree or Area Tree.
+ */
+ public void testBlock003() throws FOrayException {
+ final AreaTreeCreator creator = AreaTreeCreator.getInstance();
+ final AreaTree areaTree = creator.buildAreaTree(
+ "fo/block-003.fo");
+ final NormalFlowRA firstNormalFlowArea = this.getFirstNormalFlowArea(
+ areaTree);
+
+ /* Test location and dimensions of the block area. */
+ AreaNode node = firstNormalFlowArea.getChildAt(0);
+ assertTrue(node instanceof NormalBlockArea);
+ final NormalBlockArea blockArea = (NormalBlockArea) node;
+ /* 1 inch left margin. */
+ assertEquals(72000, blockArea.crOriginX());
+ /* 10 inches from bottom (11 inches high, 1 inch top margin). */
+ assertEquals(720000, blockArea.crOriginY());
+ /* Page is 8.5 inches wide, with 2 inches total margin.
+ * 6.5 * 72,000 = 468,000. */
+ assertEquals(468000, blockArea.crIpd());
+
+ /* Test location and dimensions of the line area. */
+ node = blockArea.getChildAt(0);
+ assertTrue(node instanceof LineArea);
+ final LineArea lineArea = (LineArea) node;
+ /* x same as the parent block. */
+ assertEquals(72000, lineArea.crOriginX());
+ /* y adjusted for half-leading = 12,000 * .2 * .5 = 1200. */
+ assertEquals(718800, lineArea.crOriginY());
+ /* ipd same as parent block. */
+ assertEquals(468000, lineArea.crIpd());
+
+ /* Test location and dimensions of the text area. The key thing we are
+ * testing here is that IT IS CENTERED. */
+ node = lineArea.getChildAt(0);
+ assertTrue(node instanceof TextArea);
+ final TextArea textArea = (TextArea) node;
+
+ /* The text is "Test of Centering". From the Helvetica AFM file, the
+ * widths are as follows: T(611) + e(556) + s(500) + t(278) + space(278)
+ * + o(556) + f(278) + space(278) + C(722) + e(556) + n(556) + t(278)
+ * + e(556) + r(333) + i(222) + n(556) + g(556) = 7,670. If these are
+ * scaled to 12 points, the millipoints used are 7670 * 12 = 92,040. */
+ assertEquals(92040, textArea.crIpd());
+
+ /* The x value of the text area content rectangle should be at the x
+ * location of the parent line area + 1/2 of the unused area in the
+ * line. Total line area ipd = 468,000. Unused line area ipd =
+ * 468,000 - 92,040 = 375,960. One half of the unused line area ipd =
+ * 187,980. x = 72,000 + 187,980 = 259,980. */
+ assertEquals(259980, textArea.crOriginX());
+
+ /* The y value of the text area content rectangle should be the same as
+ * the parent line area. */
+ assertEquals(718800, textArea.crOriginY());
+ }
+
}
Added: trunk/foray/resource/test/fo/block-003.fo
===================================================================
--- trunk/foray/resource/test/fo/block-003.fo (rev 0)
+++ trunk/foray/resource/test/fo/block-003.fo 2007-09-25 17:46:16 UTC (rev 10214)
@@ -0,0 +1,30 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<!--
+This fo tests basic centering of content on a line.
+-->
+
+<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
+
+<fo:layout-master-set>
+ <fo:simple-page-master
+ master-name="simple"
+ page-height="11in"
+ page-width="8.5in"
+ margin-top="1in"
+ margin-bottom="1in"
+ margin-left="1in"
+ margin-right="1in">
+ <fo:region-body/>
+ </fo:simple-page-master>
+</fo:layout-master-set>
+
+<fo:page-sequence master-reference="simple">
+<fo:flow flow-name="xsl-region-body">
+
+<fo:block font-family="sans-serif" font-size="12pt" text-align="center">Test of
+Centering</fo:block>
+
+</fo:flow>
+</fo:page-sequence>
+</fo:root>
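The expected values asserted in the centering test can be rechecked independently of the layout engine. The sketch below is only an illustration: `CenteringMath` and its methods are hypothetical names, not FOray classes, and the glyph widths are the Helvetica AFM figures quoted in the test comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not a FOray class) that recomputes the test's expected values. */
final class CenteringMath {

    /** Helvetica AFM widths, in 1/1000 em, copied from the comment in the test. */
    static final Map<Character, Integer> WIDTHS = new HashMap<>();
    static {
        WIDTHS.put('T', 611);
        WIDTHS.put('e', 556);
        WIDTHS.put('s', 500);
        WIDTHS.put('t', 278);
        WIDTHS.put(' ', 278);
        WIDTHS.put('o', 556);
        WIDTHS.put('f', 278);
        WIDTHS.put('C', 722);
        WIDTHS.put('n', 556);
        WIDTHS.put('r', 333);
        WIDTHS.put('i', 222);
        WIDTHS.put('g', 556);
    }

    /** Text width in millipoints. Only the glyphs listed above are supported. */
    static int textWidth(final String text, final int fontSizePoints) {
        int units = 0;
        for (char c : text.toCharArray()) {
            units += WIDTHS.get(c);
        }
        /* (units / 1000) em * size pt * 1000 millipoints per pt = units * size. */
        return units * fontSizePoints;
    }

    /** x origin of content centered within a line area. */
    static int centeredX(final int lineX, final int lineIpd, final int contentIpd) {
        return lineX + (lineIpd - contentIpd) / 2;
    }

    public static void main(final String[] args) {
        final int ipd = textWidth("Test of Centering", 12);
        System.out.println("text ipd = " + ipd);
        System.out.println("text x   = " + centeredX(72000, 468000, ipd));
    }
}
```

The line break inside the fo:block is treated here as a single space, which is what the width list in the test comment assumes.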
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
From: <vic...@us...> - 2007-09-17 16:33:55
Revision: 10213
http://foray.svn.sourceforge.net/foray/?rev=10213&view=rev
Author: victormote
Date: 2007-09-17 08:35:13 -0700 (Mon, 17 Sep 2007)
Log Message:
-----------
Use the formatted page number instead of the ordinal page number.
Modified Paths:
--------------
trunk/foray/foray-areatree/src/java/org/foray/area/PageCollection.java
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/PageCollection.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/PageCollection.java 2007-09-16 18:14:05 UTC (rev 10212)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/PageCollection.java 2007-09-17 15:35:13 UTC (rev 10213)
@@ -222,9 +222,9 @@
newPageNumber);
newPage.setFormattedNumber(formattedPageNumber);
if (!isBlank) {
- getLogger().info("[" + newPageNumber + "]");
+ getLogger().info("[" + formattedPageNumber + "]");
} else {
- getLogger().info("[" + newPageNumber + "] (blank)");
+ getLogger().info("[" + formattedPageNumber + "] (blank)");
}
this.currentPageNumber++;
incrementPageCount();
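To illustrate why the formatted number is the more useful thing to log, here is a minimal sketch. It assumes a page sequence that numbers its pages in lower-case roman numerals; `PageNumberFormat`, `lowerRoman` and `logLine` are hypothetical names and do not reflect how FOray actually formats page numbers.

```java
/** Hypothetical sketch: ordinal page number versus its formatted form. */
final class PageNumberFormat {

    private static final int[] VALUES =
            {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
            {"m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i"};

    /** Formats a positive ordinal as a lower-case roman numeral. */
    static String lowerRoman(final int ordinal) {
        if (ordinal < 1) {
            throw new IllegalArgumentException("ordinal must be positive");
        }
        int remaining = ordinal;
        final StringBuilder out = new StringBuilder();
        for (int i = 0; i < VALUES.length; i++) {
            while (remaining >= VALUES[i]) {
                out.append(SYMBOLS[i]);
                remaining -= VALUES[i];
            }
        }
        return out.toString();
    }

    /** Builds the log text in the same shape as the logger calls in the diff. */
    static String logLine(final String formattedPageNumber, final boolean isBlank) {
        return "[" + formattedPageNumber + "]" + (isBlank ? " (blank)" : "");
    }

    public static void main(final String[] args) {
        System.out.println(logLine(lowerRoman(4), false));
        System.out.println(logLine(lowerRoman(5), true));
    }
}
```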
From: <vic...@us...> - 2007-09-16 18:14:04
Revision: 10212
http://foray.svn.sourceforge.net/foray/?rev=10212&view=rev
Author: victormote
Date: 2007-09-16 11:14:05 -0700 (Sun, 16 Sep 2007)
Log Message:
-----------
Validate root children as they are added.
Modified Paths:
--------------
trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java
Modified: trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java
===================================================================
--- trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java 2007-09-16 17:51:04 UTC (rev 10211)
+++ trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java 2007-09-16 18:14:05 UTC (rev 10212)
@@ -115,38 +115,7 @@
* {@inheritDoc}
*/
protected void validateDescendants() throws FoTreeException {
- if (this.getChildCount() < 2) {
- this.throwExceptionContentModelViolation();
- }
- FObj child = this.getChildAt(0);
- if (! (child instanceof LayoutMasterSet)) {
- child.throwException("First child of " + this.getFullName()
- + " must be fo:layout-master-set.");
- }
- boolean declarationsValid = true;
- boolean bookmarkTreeValid = true;
- for (int i = 1; i < this.getChildCount(); i++) {
- child = this.getChildAt(i);
- if (child instanceof Declarations) {
- if (! declarationsValid) {
- child.throwExceptionInvalidLocation();
- }
- declarationsValid = false;
- } else if (child instanceof BookmarkTree) {
- if (! bookmarkTreeValid) {
- child.throwExceptionInvalidLocation();
- }
- declarationsValid = false;
- bookmarkTreeValid = false;
- } else if (child instanceof PageSequence
- || child instanceof PageSequenceWrapper) {
- declarationsValid = false;
- bookmarkTreeValid = false;
- } else {
- child.throwExceptionInvalidLocation();
- }
- }
- return;
+ /* Children validated as they are added. */
}
/**
@@ -438,6 +407,32 @@
* {@inheritDoc}
*/
public void addChild(final FObj child) throws FoTreeException {
+ final OrderedTreeNode lastChild = this.getLastChild();
+ if (child instanceof LayoutMasterSet) {
+ if (lastChild != null) {
+ child.throwExceptionInvalidLocation();
+ }
+ } else if (child instanceof Declarations) {
+ if (! (lastChild instanceof LayoutMasterSet)) {
+ child.throwExceptionInvalidLocation();
+ }
+ } else if (child instanceof BookmarkTree) {
+ if (! (lastChild instanceof LayoutMasterSet)
+ && (! (lastChild instanceof Declarations))) {
+ child.throwExceptionInvalidLocation();
+ }
+ } else if (child instanceof PageSequence
+ || child instanceof PageSequenceWrapper) {
+ if ((! (lastChild instanceof LayoutMasterSet))
+ && (! (lastChild instanceof Declarations))
+ && (! (lastChild instanceof BookmarkTree))
+ && (! (lastChild instanceof PageSequence))
+ && (! (lastChild instanceof PageSequenceWrapper))) {
+ child.throwExceptionInvalidLocation();
+ }
+ } else {
+ child.throwExceptionInvalidLocation();
+ }
this.getChildren().add(child);
}
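The ordering rules enforced in addChild can be modeled without the FO tree classes. The following is a simplified sketch under stated assumptions: `RootContentModel` and its `Kind` enum are hypothetical stand-ins for the real FObj subclasses, and page-sequence-wrapper is folded into `PAGE_SEQUENCE`. One thing a check made at insertion time cannot see is a missing trailing page sequence, so the sketch adds a separate `isComplete()` check for that case; that method is an addition of the sketch, not something in the commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the fo:root child ordering, validated as children are added. */
final class RootContentModel {

    enum Kind { LAYOUT_MASTER_SET, DECLARATIONS, BOOKMARK_TREE, PAGE_SEQUENCE }

    private final List<Kind> children = new ArrayList<>();

    /** Adds a child, rejecting it if it is not valid after the current last child. */
    void addChild(final Kind child) {
        final Kind last = children.isEmpty() ? null : children.get(children.size() - 1);
        final boolean valid;
        switch (child) {
        case LAYOUT_MASTER_SET:
            /* Must be the very first child. */
            valid = last == null;
            break;
        case DECLARATIONS:
            /* Only directly after the layout-master-set. */
            valid = last == Kind.LAYOUT_MASTER_SET;
            break;
        case BOOKMARK_TREE:
            /* After the layout-master-set or the declarations. */
            valid = last == Kind.LAYOUT_MASTER_SET || last == Kind.DECLARATIONS;
            break;
        default:
            /* A page sequence may follow any earlier child, but cannot be first. */
            valid = last != null;
            break;
        }
        if (!valid) {
            throw new IllegalStateException(child + " is not valid at this location.");
        }
        children.add(child);
    }

    /** True once at least one page sequence has been added (checked at end of element). */
    boolean isComplete() {
        return !children.isEmpty()
                && children.get(children.size() - 1) == Kind.PAGE_SEQUENCE;
    }

    public static void main(final String[] args) {
        final RootContentModel root = new RootContentModel();
        root.addChild(Kind.LAYOUT_MASTER_SET);
        root.addChild(Kind.BOOKMARK_TREE);
        root.addChild(Kind.PAGE_SEQUENCE);
        System.out.println("complete: " + root.isComplete());
    }
}
```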
From: <vic...@us...> - 2007-09-16 17:51:05
Revision: 10211
http://foray.svn.sourceforge.net/foray/?rev=10211&view=rev
Author: victormote
Date: 2007-09-16 10:51:04 -0700 (Sun, 16 Sep 2007)
Log Message:
-----------
Improve content model testing for Root.
Modified Paths:
--------------
trunk/foray/foray-fotree/src/java/org/foray/fotree/FObj.java
trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java
Modified: trunk/foray/foray-fotree/src/java/org/foray/fotree/FObj.java
===================================================================
--- trunk/foray/foray-fotree/src/java/org/foray/fotree/FObj.java 2007-09-13 18:58:35 UTC (rev 10210)
+++ trunk/foray/foray-fotree/src/java/org/foray/fotree/FObj.java 2007-09-16 17:51:04 UTC (rev 10211)
@@ -611,6 +611,24 @@
}
/**
+ * Convenience method that throws a general content model violation
+ * exception.
+ * @throws FoTreeException Always, as that is the purpose of this method.
+ */
+ public void throwExceptionContentModelViolation() throws FoTreeException {
+ throwException("Content model violation: " + this.getFullName());
+ }
+
+ /**
+ * Convenience method that throws a general exception indicating that an
+ * element is not valid at this location.
+ * @throws FoTreeException Always, as that is the purpose of this method.
+ */
+ public void throwExceptionInvalidLocation() throws FoTreeException {
+ throwException(this.getFullName() + " is not valid at this location.");
+ }
+
+ /**
* Throws a standard exception when an attempt is made to add content to
* an object that has an empty content model.
* @param child The child object which was attempting to be added to this.
Modified: trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java
===================================================================
--- trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java 2007-09-13 18:58:35 UTC (rev 10210)
+++ trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java 2007-09-16 17:51:04 UTC (rev 10211)
@@ -115,6 +115,37 @@
* {@inheritDoc}
*/
protected void validateDescendants() throws FoTreeException {
+ if (this.getChildCount() < 2) {
+ this.throwExceptionContentModelViolation();
+ }
+ FObj child = this.getChildAt(0);
+ if (! (child instanceof LayoutMasterSet)) {
+ child.throwException("First child of " + this.getFullName()
+ + " must be fo:layout-master-set.");
+ }
+ boolean declarationsValid = true;
+ boolean bookmarkTreeValid = true;
+ for (int i = 1; i < this.getChildCount(); i++) {
+ child = this.getChildAt(i);
+ if (child instanceof Declarations) {
+ if (! declarationsValid) {
+ child.throwExceptionInvalidLocation();
+ }
+ declarationsValid = false;
+ } else if (child instanceof BookmarkTree) {
+ if (! bookmarkTreeValid) {
+ child.throwExceptionInvalidLocation();
+ }
+ declarationsValid = false;
+ bookmarkTreeValid = false;
+ } else if (child instanceof PageSequence
+ || child instanceof PageSequenceWrapper) {
+ declarationsValid = false;
+ bookmarkTreeValid = false;
+ } else {
+ child.throwExceptionInvalidLocation();
+ }
+ }
return;
}
From: <vic...@us...> - 2007-09-13 18:58:33
Revision: 10210
http://foray.svn.sourceforge.net/foray/?rev=10210&view=rev
Author: victormote
Date: 2007-09-13 11:58:35 -0700 (Thu, 13 Sep 2007)
Log Message:
-----------
Generally remove inline items that have no children.
Modified Paths:
--------------
trunk/foray/foray-areatree/src/java/org/foray/area/BasicLinkArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/BidiOverrideArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/IndexPageCitationListArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/InlineArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/LeaderArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationArea.java
trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationLastArea.java
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/BasicLinkArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/BasicLinkArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/BasicLinkArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -126,6 +126,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/BidiOverrideArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/BidiOverrideArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/BidiOverrideArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -125,6 +125,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/IndexPageCitationListArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/IndexPageCitationListArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/IndexPageCitationListArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -129,6 +129,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/InlineArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/InlineArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/InlineArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -135,6 +135,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/LeaderArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/LeaderArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/LeaderArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -422,6 +422,10 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.traitLeaderPattern() == LeaderPattern.USE_CONTENT
+ && this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -150,6 +150,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -171,6 +171,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
Modified: trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationLastArea.java
===================================================================
--- trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationLastArea.java 2007-09-13 00:19:17 UTC (rev 10209)
+++ trunk/foray/foray-areatree/src/java/org/foray/area/PageNumberCitationLastArea.java 2007-09-13 18:58:35 UTC (rev 10210)
@@ -172,6 +172,9 @@
*/
protected boolean optimize() {
this.optimizeChildren();
+ if (this.getChildCount() < 1) {
+ return true;
+ }
return false;
}
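The pattern repeated across these area classes is a bottom-up prune: optimize the children first, then report whether this node itself has become removable. The sketch below assumes that a `true` return value tells the parent to drop the child, which the diff suggests but does not show. `PruneSketch.Node` is a hypothetical stand-in for the area classes, and `hasOwnContent` stands in for areas such as text that carry content without having children.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of pruning childless inline containers from a tree. */
final class PruneSketch {

    static final class Node {
        final String name;
        final boolean hasOwnContent;
        final List<Node> children = new ArrayList<>();

        Node(final String name, final boolean hasOwnContent) {
            this.name = name;
            this.hasOwnContent = hasOwnContent;
        }

        Node add(final Node child) {
            children.add(child);
            return this;
        }

        /**
         * Optimizes the children first, removing any that report themselves
         * as removable, then reports whether this node is now removable.
         */
        boolean optimize() {
            children.removeIf(Node::optimize);
            return !hasOwnContent && children.isEmpty();
        }
    }

    public static void main(final String[] args) {
        final Node line = new Node("line", true)
                .add(new Node("inline", false))
                .add(new Node("inline", false).add(new Node("text", true)));
        line.optimize();
        System.out.println("children left: " + line.children.size());
    }
}
```

LeaderArea is the one class in the commit with a narrower rule: it is only treated as removable when its leader pattern is USE_CONTENT, presumably because the other patterns render something even without children.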
From: <vic...@us...> - 2007-09-13 00:19:15
Revision: 10209
http://foray.svn.sourceforge.net/foray/?rev=10209&view=rev
Author: victormote
Date: 2007-09-12 17:19:17 -0700 (Wed, 12 Sep 2007)
Log Message:
-----------
Convert the unpack method.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-11 18:54:09 UTC (rev 10208)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-13 00:19:17 UTC (rev 10209)
@@ -306,6 +306,7 @@
int dot1;
boolean[] morethislevel = new boolean[MAX_DOT + 1];
int trie_root = 1;
+ char cmin = EDGE_OF_WORD;
/* End of Variables from TeX code. */
/** The banner printed when the program starts. */
@@ -675,6 +676,35 @@
}
/**
+ * Finds all transitions associated with the state with base |s|, puts them
+ * into the arrays |trieq_c|, |trieq_l|, and |trieq_r|, and sets |qmax| to
+ * one more than the number of transitions found.
+ * Freed cells are put at the beginning of the free list.
+ * @param s
+ */
+ void unpack(int s) {
+ this.qmax = 1;
+ for(char c = this.cmin; c <= cmax; c++) {
+ /* Search for transitions belonging to this state. */
+ int t = s + c;
+ if (so(trie_char[t]) == c) {
+ /* Found one. */
+ trieqc[qmax] = c;
+ trieql[qmax] = trie_link[t];
+ trieqr[qmax] = trie_back[t];
+ qmax ++;
+ /* Now free trie node. */
+ trie_back[trie_link[0]] = t;
+ trie_link[t] = trie_link[0];
+ trie_link[0] = t;
+ trie_back[t] = 0;
+ trie_char[t] = MIN_PACKED;
+ }
+ }
+ trie_taken[s] = false;
+ }
+
+ /**
* Command-line interface for the {@link PatternGenerator} class.
* @param args The command-line arguments.
* The arguments are:
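The four assignments that free a trie node in unpack are a push onto the front of a doubly linked list stored in two parallel arrays, with cell 0 acting as the list head. A minimal sketch of that structure follows. `FreeList` is a hypothetical class, and unlike patgen it closes the list into a ring through cell 0 so that the empty case needs no special handling; that is a simplification of the sketch, not a description of the converted code.

```java
/** Hypothetical sketch of a doubly linked free list kept in parallel arrays. */
final class FreeList {

    /** link[t] is the cell after t; back[t] is the cell before t. Cell 0 is the head. */
    private final int[] link;
    private final int[] back;

    FreeList(final int size) {
        /* Java zero-fills the arrays, so link[0] == back[0] == 0: an empty ring. */
        this.link = new int[size];
        this.back = new int[size];
    }

    /** Puts cell t at the beginning of the free list (same four steps as in unpack). */
    void pushFront(final int t) {
        back[link[0]] = t;
        link[t] = link[0];
        link[0] = t;
        back[t] = 0;
    }

    /** Unlinks cell t, as done when a cell is filled ("link around filled cell"). */
    void remove(final int t) {
        link[back[t]] = link[t];
        back[link[t]] = back[t];
    }

    /** First free cell, or 0 if the list is empty. */
    int first() {
        return link[0];
    }

    /** Free cell following t, or 0 at the end of the list. */
    int next(final int t) {
        return link[t];
    }

    public static void main(final String[] args) {
        final FreeList free = new FreeList(16);
        free.pushFront(5);
        free.pushFront(3);
        System.out.println("first free cell: " + free.first());
    }
}
```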
From: <vic...@us...> - 2007-09-11 18:54:06
Revision: 10208
http://foray.svn.sourceforge.net/foray/?rev=10208&view=rev
Author: victormote
Date: 2007-09-11 11:54:09 -0700 (Tue, 11 Sep 2007)
Log Message:
-----------
Have the right object throw the exception, so that the context information is meaningful.
Modified Paths:
--------------
trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java
Modified: trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java
===================================================================
--- trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java 2007-09-11 14:49:50 UTC (rev 10207)
+++ trunk/foray/foray-fotree/src/java/org/foray/fotree/fo/obj/Root.java 2007-09-11 18:54:09 UTC (rev 10208)
@@ -271,10 +271,11 @@
return;
}
- // Check for duplicate
+ /* Check for duplicate. */
final Object existingId = this.idMap.get(id);
if (existingId != null && existingId != fobj) {
- throwException("The id \"" + id + "\" has already been assigned.");
+ fobj.throwException("The id \"" + id + "\" has already been "
+ + "assigned.");
}
this.idMap.put(id, fobj);
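The point of this change is that the error should be reported in terms of the object carrying the duplicate id rather than the root. A small sketch of the same idea follows. It assumes nothing about FOray beyond what the diff shows: `IdRegistry` is a hypothetical class, and a plain exception message naming the offending object stands in for `fobj.throwException`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of duplicate-id detection that reports the offending object. */
final class IdRegistry {

    private final Map<String, Object> idMap = new HashMap<>();

    /** Registers an id; re-registering the same object under the same id is allowed. */
    void register(final String id, final Object owner) {
        final Object existing = idMap.get(id);
        if (existing != null && existing != owner) {
            /* Name the object that carries the duplicate, not the registry. */
            throw new IllegalArgumentException("The id \"" + id
                    + "\" has already been assigned (offending object: " + owner + ").");
        }
        idMap.put(id, owner);
    }

    public static void main(final String[] args) {
        final IdRegistry registry = new IdRegistry();
        registry.register("intro", "fo:block#1");
        try {
            registry.register("intro", "fo:block#2");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```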
From: <vic...@us...> - 2007-09-11 14:49:46
Revision: 10207
http://foray.svn.sourceforge.net/foray/?rev=10207&view=rev
Author: victormote
Date: 2007-09-11 07:49:50 -0700 (Tue, 11 Sep 2007)
Log Message:
-----------
Add the firstfit method and related functions.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 23:35:27 UTC (rev 10206)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-11 14:49:50 UTC (rev 10207)
@@ -595,6 +595,86 @@
}
/**
+ * The |first_fit| procedure finds a hole in the packed trie into which the
+ * state in |trieq_c|, |trieq_l|, and |trieq_r| will fit. This is normally
+ * done by going through the linked list of unoccupied cells and testing if
+ * the state will fit at each position. However if a state has too many
+ * transitions (and is therefore unlikely to fit among existing
+ * transitions) we don't bother and instead just pack it immediately to the
+ * right of the occupied region (starting at |trie_max+1|).
+ * @return The pointer to the location of the trie node.
+ */
+ int firstfit() {
+ int s = computeBaseLocation();
+ /* Pack it. */
+ for (int q = 1; q <= qmax; q++) {
+ int t = s + trieqc[q];
+ trie_link[trie_back[t]] = trie_link[t];
+ /* Link around filled cell. */
+ trie_back[trie_link[t]] = trie_back[t];
+ trie_char[t] = si(trieqc[q]);
+ trie_link[t] = trieql[q];
+ trie_back[t] = trieqr[q];
+ if (t > trie_max) {
+ trie_max = t;
+ }
+ }
+ trie_taken[s] = true;
+ return s;
+ }
+
+ /**
+ * The threshold for large states is initially 5 transitions.
+ * If more than one level of patterns is being generated, the threshold is
+ * set to 7 on subsequent levels because the pattern trie will be sparser
+ * after bad patterns are deleted (see |delete_bad_patterns|).
+ * @return The value to be used for "s", or -1 if no base location is found.
+ * @see "patgen.web, line 738"
+ */
+ int computeBaseLocation() {
+ int t = 0;
+ if (qmax > qmax_thresh) {
+ t = trie_back[trie_max + 1];
+ }
+ while (true) {
+ t = trie_link[t];
+ /* Get next unoccupied cell. */
+ int s = t - trieqc[1];
+ ensureTrieCapacity(s);
+ if (triectaken[s]) {
+ return -1;
+ }
+ /* Check if state fits here. */
+ for (int q = qmax; q >= 2; q++) {
+ if (trie_char[s + trieqc[q]] != MIN_PACKED) {
+ return -1;
+ }
+ }
+ }
+ }
+
+ /**
+ * The trie is only initialized (as a doubly linked list of empty cells) as
+ * far as necessary. Here we extend the initialization if necessary, and
+ * check for overflow.
+ * @see "patgen.web, line 754"
+ */
+ void ensureTrieCapacity(final int s) {
+ if (s > TRIE_SIZE - NUM_ASCII_CODES) {
+ overflow("pattern trie nodes", TRIE_SIZE);
+ }
+ while (trie_bmax<s) {
+ trie_bmax ++;
+ trie_taken[trie_bmax] = false;
+ trie_char[trie_bmax + LAST_ASCII_CODE] = MIN_PACKED;
+ trie_link[trie_bmax + LAST_ASCII_CODE] = trie_bmax
+ + NUM_ASCII_CODES;
+ trie_back[trie_bmax + NUM_ASCII_CODES] = trie_bmax
+ + LAST_ASCII_CODE;
+ }
+ }
+
+ /**
* Command-line interface for the {@link PatternGenerator} class.
* @param args The command-line arguments.
* The arguments are:
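The first-fit idea described in the Javadoc above can be shown in a much reduced form. The sketch below is not the patgen algorithm as converted: it scans candidate bases linearly instead of walking the free list, has no large-state shortcut, and `FirstFitSketch` with its parameters is hypothetical. What it keeps is the core rule that when a state does not fit at one base, the search moves on to the next candidate; the converted computeBaseLocation in this commit returns -1 at that point instead. A base is also never used for two states, which is the role `trie_taken` appears to play.

```java
/** Hypothetical, simplified first-fit packing of a state's transitions. */
final class FirstFitSketch {

    /**
     * Finds the smallest unused base s such that every cell s + offset is free,
     * marks those cells occupied and the base used, and returns s.
     * Offsets are assumed to be non-negative, and baseUsed is assumed to be
     * at least as long as occupied.
     */
    static int firstFit(final boolean[] occupied, final boolean[] baseUsed,
            final int[] offsets) {
        int maxOffset = 0;
        for (int offset : offsets) {
            maxOffset = Math.max(maxOffset, offset);
        }
        for (int s = 0; s + maxOffset < occupied.length; s++) {
            if (baseUsed[s]) {
                /* Another state already starts here; try the next base. */
                continue;
            }
            boolean fits = true;
            for (int offset : offsets) {
                if (occupied[s + offset]) {
                    fits = false;
                    break;
                }
            }
            if (!fits) {
                continue;
            }
            /* Pack it. */
            for (int offset : offsets) {
                occupied[s + offset] = true;
            }
            baseUsed[s] = true;
            return s;
        }
        throw new IllegalStateException("no room left in the packed trie");
    }

    public static void main(final String[] args) {
        final boolean[] occupied = new boolean[16];
        final boolean[] baseUsed = new boolean[16];
        occupied[1] = true;
        System.out.println(firstFit(occupied, baseUsed, new int[] {1, 3}));
        System.out.println(firstFit(occupied, baseUsed, new int[] {1, 2}));
    }
}
```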
From: <vic...@us...> - 2007-09-10 23:35:24
Revision: 10206
http://foray.svn.sourceforge.net/foray/?rev=10206&view=rev
Author: victormote
Date: 2007-09-10 16:35:27 -0700 (Mon, 10 Sep 2007)
Log Message:
-----------
Convert initpatterntrie logic.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 22:44:42 UTC (rev 10205)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 23:35:27 UTC (rev 10206)
@@ -229,10 +229,10 @@
char[] xext = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
char[] xhyf = new char[HYPH_SIZE];
char cmax;
- char[] triec = new char[TRIE_SIZE + 1];
- int[] triel = new int[TRIE_SIZE + 1];
- int[] trier = new int[TRIE_SIZE + 1];
- boolean[] trietaken = new boolean[TRIE_SIZE + 1];
+ char[] trie_char = new char[TRIE_SIZE + 1];
+ int[] trie_link = new int[TRIE_SIZE + 1];
+ int[] trie_back = new int[TRIE_SIZE + 1];
+ boolean[] trie_taken = new boolean[TRIE_SIZE + 1];
char[] triecc = new char[TRIEC_SIZE + 1];
int[] triecl = new int[TRIEC_SIZE + 1];
int[] triecr = new int[TRIEC_SIZE + 1];
@@ -242,11 +242,11 @@
int[] trieql = new int[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
int[] trieqr = new int[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
char qmax;
- char qmaxthresh;
- int triemax;
- int triebmax;
- int triecount;
- int opcount;
+ char qmax_thresh;
+ int trie_max;
+ int trie_bmax;
+ int trie_count;
+ int op_count;
char[] pat = new char[MAX_DOT + 1];
int patlen;
int triecmax;
@@ -305,6 +305,7 @@
// int k;
int dot1;
boolean[] morethislevel = new boolean[MAX_DOT + 1];
+ int trie_root = 1;
/* End of Variables from TeX code. */
/** The banner printed when the program starts. */
@@ -534,7 +535,7 @@
* @return The ASCII code corresponding to <code>c</code>.
* @see "patgen.web, line 384"
*/
- char getASCII (char c) {
+ char getASCII(char c) {
char i = xord[c];
if (i == 0) {
boolean found = false;
@@ -568,6 +569,31 @@
System.exit(1);
}
+ void initpatterntrie() {
+ for (char c = 0; c <= LAST_ASCII_CODE; c++) {
+ /* Indicates node occupied; fake for |c=0|. */
+ trie_char[trie_root + c] = si(c);
+ trie_link[trie_root + c] = 0;
+ trie_back[trie_root + c] = 0;
+ trie_taken[trie_root + c] = false;
+
+ trie_taken[trie_root] = true;
+ trie_bmax = trie_root;
+ trie_max = trie_root + LAST_ASCII_CODE;
+ trie_count = NUM_ASCII_CODES;
+ qmax_thresh = 5;
+ /* |trie_link(0)| is used as the head of the doubly linked list of
+ * unoccupied cells. */
+ trie_link[0] = trie_max + 1;
+ trie_back[trie_max + 1] = 0;
+ /* Clear output hash table. */
+ for (int h = 1; h <= MAX_OPS; h++) {
+ ops[h].val = 0;
+ }
+ op_count = 0;
+ }
+ }
+
/**
* Command-line interface for the {@link PatternGenerator} class.
* @param args The command-line arguments.
From: <vic...@us...> - 2007-09-10 22:44:38
Revision: 10205
http://foray.svn.sourceforge.net/foray/?rev=10205&view=rev
Author: victormote
Date: 2007-09-10 15:44:42 -0700 (Mon, 10 Sep 2007)
Log Message:
-----------
Convert getASCII function.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 21:45:22 UTC (rev 10204)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 22:44:42 UTC (rev 10205)
@@ -124,6 +124,9 @@
/** The highest allowed ASCII_code value. */
static final short LAST_ASCII_CODE = 255;
+ /** Number of different |ASCII_code| values. */
+ static final short NUM_ASCII_CODES = LAST_ASCII_CODE + 1;
+
/** Character class constant for the character ' '. */
static final byte SPACE_CLASS = 0;
@@ -349,6 +352,10 @@
initialize();
}
+ /**
+ * Initializes the various data structures.
+ * @see "patgen.web, line 313"
+ */
void initialize() {
/* We want to make sure that the constants defined in this program
* satisfy all the required relations. Some of them are needed to avoid
@@ -499,6 +506,7 @@
* Converts |ASCII_code| to |packed_ASCII_code|.
* @param inputChar The ASCII_code to be converted.
* @return The corresponding packed_ASCII_code.
+ * @see "patgen.web, line 307"
*/
private char si(char inputChar) {
/* TODO: This method can probably be removed since we don't have the
@@ -510,6 +518,7 @@
* Converts |packed_ASCII_code| to |ASCII_code|.
* @param inputChar The packed_ASCII_code to be converted.
* @return The corresponding ASCII_code;
+ * @see "patgen.web, line 308"
*/
private char so(char inputChar) {
/* TODO: This method can probably be removed since we don't have the
@@ -518,6 +527,48 @@
}
/**
+ * Returns the |ASCII_code| corresponding to a character, assigning a new
+ * |ASCII_code| first if necessary.
+ * Used only while reading the |translate| file.
+ * @param c
+ * @return The ASCII code corresponding to <code>c</code>.
+ * @see "patgen.web, line 384"
+ */
+ char getASCII (char c) {
+ char i = xord[c];
+ if (i == 0) {
+ boolean found = false;
+ while (i < LAST_ASCII_CODE) {
+ i++;
+ if ((xchr[i] == ' ')
+ && (i != ' ')) {
+ found = true;
+ break;
+ }
+ }
+ if (found) {
+ xord [c] = i;
+ xchr [i] = c;
+ } else {
+ overflow("characters", NUM_ASCII_CODES);
+ }
+ }
+ return i;
+ }
+
+ /**
+ * Standard overflow logic.
+ * @param message The specifics of what has overflowed.
+ * @param capacity The capacity of the item that has overflowed.
+ * @see "patgen.web, line 256"
+ */
+ private void overflow(final String message, final int capacity) {
+ this.logger.error("PatternGenerator capacity exceeded: " + message
+ + " (" + capacity + ").");
+ System.exit(1);
+ }
+
+ /**
* Command-line interface for the {@link PatternGenerator} class.
* @param args The command-line arguments.
* The arguments are:
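The lookup-or-assign behavior of getASCII can be exercised in isolation. The sketch below mirrors the loop in the diff with two differences that are choices of the sketch: it throws an exception instead of logging and calling System.exit, and it starts with an empty table rather than the preloaded xchr values. `CodeAssigner` and `getCode` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical sketch of lazily assigning internal codes to input characters. */
final class CodeAssigner {

    static final int LAST_CODE = 255;

    /** Maps an input character to its internal code; 0 means "not assigned yet". */
    private final char[] xord = new char[LAST_CODE + 1];

    /** Maps an internal code back to its character; ' ' marks an unused slot. */
    private final char[] xchr = new char[LAST_CODE + 1];

    CodeAssigner() {
        Arrays.fill(xchr, ' ');
    }

    /** Returns the code for c (which must be at most 255), assigning one if needed. */
    char getCode(final char c) {
        char i = xord[c];
        if (i == 0) {
            boolean found = false;
            while (i < LAST_CODE) {
                i++;
                /* The slot for the space character itself is never handed out. */
                if (xchr[i] == ' ' && i != ' ') {
                    found = true;
                    break;
                }
            }
            if (!found) {
                throw new IllegalStateException("capacity exceeded: characters");
            }
            xord[c] = i;
            xchr[i] = c;
        }
        return i;
    }

    public static void main(final String[] args) {
        final CodeAssigner codes = new CodeAssigner();
        System.out.println((int) codes.getCode('a'));
        System.out.println((int) codes.getCode('b'));
        System.out.println((int) codes.getCode('a'));
    }
}
```

Throwing instead of exiting keeps the sketch testable; whether the real tool should do the same is a separate design question.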
From: <vic...@us...> - 2007-09-10 21:45:19
Revision: 10204
http://foray.svn.sourceforge.net/foray/?rev=10204&view=rev
Author: victormote
Date: 2007-09-10 14:45:22 -0700 (Mon, 10 Sep 2007)
Log Message:
-----------
Convert the initialization code.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 18:44:19 UTC (rev 10203)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 21:45:22 UTC (rev 10204)
@@ -55,6 +55,8 @@
import org.foray.common.WKConstants;
+import org.apache.commons.logging.Log;
+
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
@@ -113,22 +115,73 @@
/** Maximum number of levels + 1. Also used to denote bad patterns. */
static final byte MAX_VAL = 10;
+ /** Ordinal number of the smallest element of text_char. */
+ static final byte FIRST_TEXT_CHAR = 0;
+
+ /** Ordinal number of the largest element of text_char. */
+ static final short LAST_TEXT_CHAR = 255;
+
+ /** The highest allowed ASCII_code value. */
+ static final short LAST_ASCII_CODE = 255;
+
+ /** Character class constant for the character ' '. */
+ static final byte SPACE_CLASS = 0;
+
+ /** Character class constant for the characters '0' thru '9'. */
+ static final byte DIGIT_CLASS = 1;
+
+ /** Character class constant for the "hyphen" characters, '.', '-', and
+ * '*'. */
+ static final byte HYF_CLASS = 2;
+
+ /** Character class constant for the characters that are letters, that is,
+ * 'a' thru 'z' and 'A' thru 'Z'. */
+ static final byte LETTER_CLASS = 3;
+
+ /** Character class constant for characters that start a multi-character
+ * sequence representing a letter. */
+ static final byte ESCAPE_CLASS = 4;
+
+ /** Character class constant for characters that normally should not
+ * occur. */
+ static final byte INVALID_CLASS = 5;
+
+ /** Constant indicating "no hyphen". */
+ static final byte NO_HYF = 0;
+
+ /** Constant indicating "erroneous hyphen". */
+ static final byte ERR_HYF = 1;
+
+ /** Constant indicating "hyphen". */
+ static final byte IS_HYF = 2;
+
+ /** Constant indicating "found hyphen". */
+ static final byte FOUND_HYF = 3;
+
+ /** Change this constant to -128 when necessary, and don't forget to change
+ * the definitions of {@link #si(char)} and {@link #so(char)} below
+ * accordingly. */
+ static final byte MIN_PACKED = 0;
+
/** A magic number for the size of the array holding the filename. */
private static final int FILENAME_SIZE = 9;
/** A magic number of unknown significance. */
private static final int HYPH_SIZE = 4;
- /** A magic number of unknown significance. */
+ /** |internal_code| for start and end of a word. */
+ private static final byte EDGE_OF_WORD = 1;
+
+ /** The number of characters that are digits, that is, '0' thru '9'. */
private static final int DIGITS_SIZE = 10;
/** Space for pattern trie. */
- private static final int TRIE_SIZE = 550000;
+ private static final int TRIE_SIZE = 55000;
/** Space for pattern count trie, must be less than {@link #TRIE_SIZE} and
* greater than the number of occurrences of any pattern in the
* dictionary. */
- private static final int TRIEC_SIZE = 260000;
+ private static final int TRIEC_SIZE = 26000;
/** Size of output hash table, should be a multiple of 510. */
private static final short MAX_OPS = 4080;
@@ -161,7 +214,11 @@
int goodwt;
int badwt;
int thresh;
+
+ /** Specifies conversion of input characters. */
char[] xord = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+
+ /** Specifies conversion of output characters. */
char[] xchr = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
char[] xclass = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
char[] xint = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
@@ -240,9 +297,9 @@
int n1;
int n2;
int n3;
- int i;
- int j;
- int k;
+// int i;
+// int j;
+// int k;
int dot1;
boolean[] morethislevel = new boolean[MAX_DOT + 1];
/* End of Variables from TeX code. */
@@ -256,6 +313,8 @@
/** The output stream. */
private OutputStream output;
+ private Log logger;
+
/**
* Constructor.
* @param input The input stream to be read.
@@ -287,10 +346,178 @@
* patterns.
*/
private void patgen() {
+ initialize();
+ }
+ void initialize() {
+ /* We want to make sure that the constants defined in this program
+ * satisfy all the required relations. Some of them are needed to avoid
+ * time-consuming checks while processing the dictionary and/or to
+ * prevent range check and array bound violations. */
+
+ int bad = 0;
+ if (LAST_ASCII_CODE < WKConstants.MAX_7_BIT_UNSIGNED_VALUES - 1) {
+ bad = 1 ;
+ }
+ if ((si((char) 0) != MIN_PACKED)
+ || (so((char) MIN_PACKED) != 0)) {
+ bad = 2 ;
+ }
+ if ((TRIEC_SIZE < 4096)
+ || (TRIE_SIZE < TRIEC_SIZE)) {
+ bad = 3;
+ }
+ if (MAX_OPS > TRIE_SIZE) {
+ bad = 4;
+ }
+ if (MAX_VAL > 10) {
+ bad = 5;
+ }
+ if (MAX_BUF_LEN < MAX_LEN) {
+ bad = 6;
+ }
+ if (bad > 0) {
+ this.logger.error("Bad constants---case " + bad) ;
+ System.exit(1);
+ }
+
+ for (int j = 0; j < xchr.length; j++) {
+ xchr [j]= ' ';
+ }
+ xchr [46]= '.';
+ xchr [48]= '0';
+ xchr [49]= '1';
+ xchr [50]= '2';
+ xchr [51]= '3';
+ xchr [52]= '4';
+ xchr [53]= '5';
+ xchr [54]= '6';
+ xchr [55]= '7';
+ xchr [56]= '8';
+ xchr [57]= '9';
+ xchr [65]= 'A';
+ xchr [66]= 'B';
+ xchr [67]= 'C';
+ xchr [68]= 'D';
+ xchr [69]= 'E';
+ xchr [70]= 'F';
+ xchr [71]= 'G';
+ xchr [72]= 'H';
+ xchr [73]= 'I';
+ xchr [74]= 'J';
+ xchr [75]= 'K';
+ xchr [76]= 'L';
+ xchr [77]= 'M';
+ xchr [78]= 'N';
+ xchr [79]= 'O';
+ xchr [80]= 'P';
+ xchr [81]= 'Q';
+ xchr [82]= 'R';
+ xchr [83]= 'S';
+ xchr [84]= 'T';
+ xchr [85]= 'U';
+ xchr [86]= 'V';
+ xchr [87]= 'W';
+ xchr [88]= 'X';
+ xchr [89]= 'Y';
+ xchr [90]= 'Z';
+ xchr [97]= 'a';
+ xchr [98]= 'b';
+ xchr [99]= 'c';
+ xchr [100]= 'd';
+ xchr [101]= 'e';
+ xchr [102]= 'f';
+ xchr [103]= 'g';
+ xchr [104]= 'h';
+ xchr [105]= 'i';
+ xchr [106]= 'j';
+ xchr [107]= 'k';
+ xchr [108]= 'l';
+ xchr [109]= 'm';
+ xchr [110]= 'n';
+ xchr [111]= 'o';
+ xchr [112]= 'p';
+ xchr [113]= 'q';
+ xchr [114]= 'r';
+ xchr [115]= 's';
+ xchr [116]= 't';
+ xchr [117]= 'u';
+ xchr [118]= 'v';
+ xchr [119]= 'w';
+ xchr [120]= 'x';
+ xchr [121]= 'y';
+ xchr [122]= 'z';
+
+ /* The following system-independent code makes the xord array contain
+ * a suitable inverse to the information in xchr. */
+
+ /* ASCII_code that should not appear. */
+ char invalid_code = 0;
+
+ /* Ordinal index for the tab character. Tab characters seem to be
+ * unavoidable with files from UNIX systems. */
+ char tab_char = '\t';
+
+ for (int i = FIRST_TEXT_CHAR; i <= LAST_TEXT_CHAR; i++) {
+ xord[i] = invalid_code;
+ }
+
+ for (char j = 0; j <= LAST_ASCII_CODE; j++) {
+ xord[xchr[j]] = j;
+ }
+ xord[' '] = ' ';
+ xord[tab_char] = ' ';
+
+ /* Initialize the xclass and xint arrays with default values. Both are
+ * modified by other initialization routines below. */
+ for (int i = FIRST_TEXT_CHAR; i <= LAST_TEXT_CHAR; i++) {
+ xclass[i] = INVALID_CLASS;
+ xint[i] = 0;
+ }
+ xclass[' '] = SPACE_CLASS;
+
+ /* Initialize the xext array. */
+ for (int j = 0; j <= LAST_ASCII_CODE; j++) {
+ xext[j] = ' ';
+ }
+ xext[EDGE_OF_WORD]= '.' ;
+
+ /* Initialize the xdig array. */
+ for (char j = 0; j < xdig.length; j++) {
+ xdig[j] = xchr[j + '0'];
+ xclass[xdig[j]] = DIGIT_CLASS;
+ xint[xdig[j]] = j;
+ }
+
+ /* Initialize the xhyf array. */
+ xhyf[ERR_HYF]= '.' ;
+ xhyf[IS_HYF]= '-' ;
+ xhyf[FOUND_HYF]= '*' ;
}
/**
+ * Converts |ASCII_code| to |packed_ASCII_code|.
+ * @param inputChar The ASCII_code to be converted.
+ * @return The corresponding packed_ASCII_code.
+ */
+ private char si(char inputChar) {
+ /* TODO: This method can probably be removed since we don't have the
+ * same resource constraint issues as Pascal does. */
+ return inputChar;
+ }
+
+ /**
+ * Converts |packed_ASCII_code| to |ASCII_code|.
+ * @param inputChar The packed_ASCII_code to be converted.
+ * @return The corresponding ASCII_code.
+ */
+ private char so(char inputChar) {
+ /* TODO: This method can probably be removed since we don't have the
+ * same resource constraint issues as Pascal does. */
+ return inputChar;
+ }
+
+ /**
* Command-line interface for the {@link PatternGenerator} class.
* @param args The command-line arguments.
* The arguments are:
From: <vic...@us...> - 2007-09-10 18:44:16
Revision: 10203
http://foray.svn.sourceforge.net/foray/?rev=10203&view=rev
Author: victormote
Date: 2007-09-10 11:44:19 -0700 (Mon, 10 Sep 2007)
Log Message:
-----------
Add the original patgen.web file to the directory temporarily, and remove the similar commented code from the PatternGenerator class.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
Added Paths:
-----------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/patgen.web
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 18:01:28 UTC (rev 10202)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 18:44:19 UTC (rev 10203)
@@ -100,6 +100,11 @@
*/
public class PatternGenerator {
/* TODO: Turn off checkstyle suppression for this class. */
+ /* TODO: Remove patgen.c and patgen.web files from this directory after
+ * conversion has been completed. */
+ /* TODO: This class probably has overlap with the TernaryTree class in this
+ * package. Duplicate code should be removed from this class and the
+ * TernaryTree class should be used instead. */
/* Start of Constants from TeX code. */
/* TODO: After conversion is complete, make all of these constants and
@@ -314,1889 +319,4 @@
generator.process();
}
-/*
- The program uses \PASCAL's standard |input| and |output| files to read
- from and write to the user's terminal.
-
- @d print(#)==write(output,#)
- @d print_ln(#)==write_ln(output,#)
- @d get_input(#)==read(input,#)
- @d get_input_ln(#)==
- begin if eoln(input) then read_ln(input);
- read(input,#);
- end
- @#
- @d end_of_PATGEN=9999
-
- @p @<Compiler directives@>@/
- program PATGEN(@!dictionary,@!patterns,@!translate,@!patout);
- label end_of_PATGEN;
- const @<Constants in the outer block@>@/
- type @<Types in the outer block@>@/
- var @<Globals in the outer block@>@/
- procedure initialize; {this procedure gets things started properly}
- var @<Local variables for initialization@>@/
- begin print_ln(banner);@/
- @<Set initial values@>@/
- end;
-
- @ The patterns are generated in a series of sequential passes through the
- dictionary. In each pass, we collect count statistics for a particular
- type of pattern, taking into account the effect of patterns chosen in
- previous passes. At the end of a pass, the counts are examined and new
- patterns are selected.
-
- Patterns are chosen one level at a time, in order of increasing
- hyphenation value. In the sample run shown below, the parameters
- |hyph_start| and |hyph_finish| specify the first and last levels,
- respectively, to be generated.
-
- Patterns at each level are chosen in order of increasing pattern length
- (usually starting with length~2). This is controlled by the parameters
- |pat_start| and |pat_finish| specified at the beginning of each level.
-
- Furthermore patterns of the same length applying to different
- intercharacter positions are chosen in separate passes through the
- dictionary. Since patterns of length $n$ may apply to $n+1$ different
- positions, choosing a set of patterns of lengths $2$ through $n$ for a
- given level requires $(n+1)(n+2)/2-3$ passes through the word list.
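
Editorial note: the pass count quoted in the removed WEB text can be checked directly. The sketch below is not part of the commit; the class and method names are hypothetical. It sums the (k + 1) intercharacter positions for each pattern length k from 2 to n and compares the result with the closed form (n+1)(n+2)/2 - 3.

```java
/** Hypothetical helper, not part of FOray: counts dictionary passes per level. */
final class PassCount {

    /** Sums the (len + 1) dot positions for every pattern length 2..maxLen. */
    static int bySummation(int maxLen) {
        int passes = 0;
        for (int len = 2; len <= maxLen; len++) {
            passes += len + 1;
        }
        return passes;
    }

    /** Closed form quoted in the patgen.web text: (n+1)(n+2)/2 - 3. */
    static int byFormula(int maxLen) {
        return (maxLen + 1) * (maxLen + 2) / 2 - 3;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 15; n++) {
            System.out.println(n + ": " + bySummation(n) + " / " + byFormula(n));
        }
    }
}
```

For pattern lengths 2 through 3, both methods give 7 passes.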
-
- At each level, the selection of patterns is controlled by the three
- parameters |good_wt|, |bad_wt|, and |thresh|. A hyphenating pattern will
- be selected if |good*good_wt-bad*bad_wt>=thresh|, where |good| and
- |bad| are the number of times the pattern could and could not be
- hyphenated, respectively, at a particular point. For inhibiting patterns,
- |good| is the number of errors inhibited, and |bad| is the number of
- previously found hyphens inhibited.
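
Editorial note: the selection rule described above is a single inequality. A minimal Java sketch follows, assuming plain `int` counts; the class name and parameter names are illustrative and do not come from the FOray source.

```java
/** Hypothetical helper, not part of FOray: the pattern selection test. */
final class PatternSelection {

    /**
     * Returns true if good * goodWt - bad * badWt >= thresh, which is the
     * condition the WEB text gives for selecting a pattern.
     */
    static boolean isSelected(int good, int bad, int goodWt, int badWt,
            int thresh) {
        return good * goodWt - bad * badWt >= thresh;
    }

    public static void main(String[] args) {
        // Weights 1, 3, 3 as in the sample run quoted in the WEB text.
        System.out.println(isSelected(6, 1, 1, 3, 3));
        System.out.println(isSelected(5, 1, 1, 3, 3));
    }
}
```

With the weights 1, 3, 3 from the sample run, a pattern that is good 6 times and bad once just meets the threshold, while one that is good 5 times and bad once does not.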
-
- @<Globals...@>=
- @!pat_start, @!pat_finish: dot_type;
- @!hyph_start, @!hyph_finish: val_type;
- @!good_wt, @!bad_wt, @!thresh: integer;
-
- @ The proper choice of the parameters to achieve a desired degree of
- hyphenation is discussed in Chapter~4. Below we show part of a sample run
- of \.{PATGEN}, with the user's inputs underlined.
- $$\vbox{\halign{\.{#\hfil}\cr
- $\\underline{\smash{\.{ex patgen}}}$\cr
- DICTIONARY : $\\underline{\smash{\.{murray.hyf}}}$\cr
- PATTERNS : $\\underline{\smash{\.{nul:}}}$\cr
- TRANSLATE : $\\underline{\smash{\.{nul:}}}$\cr
- PATOUT : $\\underline{\smash{\.{murray.pat}}}$\cr
- This is PATGEN, Version 2.0\cr
- left\_hyphen\_min = 2, right\_hyphen\_min = 3, 26 letters\cr
- 0 patterns read in\cr
- pattern trie has 256 nodes, trie\_max = 256, 0 outputs\cr
- hyph\_start, hyph\_finish: $\\underline{\.{1 1}}$\cr
- pat\_start, pat\_finish: $\\underline{\.{2 3}}$\cr
- good weight, bad weight, threshold: $\\underline{\.{1 3 3}}$\cr
- processing dictionary with pat\_len = 2, pat\_dot = 1\cr
- \cr
- 0 good, 0 bad, 3265 missed\cr
- 0.00 \%, 0.00 \%, 100.00 \%\cr
- 338 patterns, 466 nodes in count trie, triec\_max = 983\cr
- 46 good and 152 bad patterns added (more to come)\cr
- finding 715 good and 62 bad hyphens, efficiency = 10.72\cr
- pattern trie has 326 nodes, trie\_max = 509, 2 outputs\cr
- processing dictionary with pat\_len = 2, pat\_dot = 0\cr
- \cr
- \hskip 1.5em ...\cr
- \cr
- 1592 nodes and 39 outputs deleted\cr
- total of 220 patterns at hyph\_level 1\cr
- hyphenate word list? $\\underline{\smash{\.{y}}}$\cr
- writing pattmp.1\cr
- \cr
- 2529 good, 243 bad, 736 missed\cr
- 77.46 \%, 7.44 \%, 22.54 \%\cr}}$$
-
- @ Note that before beginning a pattern selection run, a file of existing
- patterns may be read in. In order for pattern selection to work properly,
- this file should only contain patterns with hyphenation values less than
- |hyph_start|. Each word in the dictionary is hyphenated according to the
- existing set of patterns (including those chosen on previous passes of the
- current run) before pattern statistics are collected.
-
- Also, a hyphenated word list may be written out at the end of a run. This
- list can be read back in as the `dictionary' to continue pattern selection
- from this point. In addition to ordinary hyphens (|'-'|) the new list
- will contain two additional kinds of ``hyphens'' between letters, namely
- hyphens that have been found by previously generated patterns, as well
- as erroneous hyphens that have been inserted by those patterns. These
- are represented by the symbols |'*'| and |'.'|, respectively. The three
- characters |'-'|, |'*'|, and |'.'| are, in fact, just the default values
- used to represent the three kinds of hyphens, the |translate| file may
- specify different characters to be used instead of them.
-
- In addition, a word list can include hyphen weights, both for entire words
- and for individual hyphen positions. (The syntax for this is explained in
- the dictionary processing routines.) Thus common words can be weighted
- more heavily, or, more generally, words can be weighted according to their
- frequency of occurrence, if such information is available. The use of
- hyphen weights combined with an appropriate setting of the pattern
- selection threshold can be used to guarantee hyphenation of certain words
- or certain hyphen positions within a word.
-
- @ Below we show the first few lines of a typical word list,
- before and after generating a level of patterns.
- $$\vbox{\halign{\tabskip 1in\.{#\hfil}&\.{#\hfil}\cr
- abil-i-ty& abil*i*ty\cr
- ab-sence& ab*sence\cr
- ab-stract& ab*stract\cr
- ac-a-dem-ic& ac-a-d.em-ic\cr
- ac-cept& ac*cept\cr
- ac-cept-able& ac*cept-able\cr
- ac-cept-ed& ac*cept*ed\cr
- \hskip 1.5em ...&\hskip 1.5em ...\cr
- }}$$
-
- @ We augment \PASCAL 's control structures a bit using |goto|\\unskip's
- and the following symbolic labels.
-
- @d exit=10 {go here to leave a procedure}
- @d continue=22 {go here to resume a loop}
- @d done=30 {go here to exit a loop}
- @d found=40 {go here when you've found it}
- @d not_found=41 {go here when you've found something else}
-
- @ Here are some macros for common programming idioms.
-
- @d incr(#)==#:=#+1 {increase a variable by unity}
- @d decr(#)==#:=#-1 {decrease a variable by unity}
- @#
- @d Incr_Decr_end(#)==#
- @d Incr(#)==#:=#+Incr_Decr_end {we use |Incr(a)(b)| to increase \dots}
- @d Decr(#)==#:=#-Incr_Decr_end {\dots\ and |Decr(a)(b)| to decrease
- variable |a| by |b|; this can be optimized for some compilers}
- @#
- @d loop == @+ while true do@+ {repeat over and over until a |goto| happens}
- @d do_nothing == {empty statement}
- @d return==goto exit {terminate a procedure call}
- @f return==nil
- @f loop == xclause
-
- @ In case of serious problems \.{PATGEN} will give up, after issuing an
- error message about what caused the error. Such errors might be
- discovered inside of subroutines inside of subroutines, so a \.{WEB}
- macro called |jump_out| has been introduced. This macro, which transfers
- control to the label |end_of_PATGEN| at the end of the program, contains
- the only non-local |@!goto| statement in \.{PATGEN}. Some \PASCAL\
- compilers do not implement non-local |goto| statements. In such cases
- the |goto end_of_PATGEN| in the definition of |jump_out| should simply
- be replaced by a call on some system procedure that quietly terminates
- the program.
-
- An overflow stop occurs if \.{PATGEN}'s tables aren't large enough.
-
- @d jump_out==goto end_of_PATGEN {terminates \.{PATGEN}}
- @#
- @d error(#)==begin print_ln(#); jump_out; end
- @d overflow(#)==error('PATGEN capacity exceeded, sorry [',#,'].')
- @.PATGEN capacity exceeded ...@>
-
- @* The character set.
- Since different \PASCAL\ systems may use different character sets, we use
- the name |text_char| to stand for the data type of characters appearing in
- external text files. We also assume that |text_char| consists of the
- elements |chr(first_text_char)| through |chr(last_text_char)|, inclusive.
- The definitions below should be adjusted if necessary.
- @^character set dependencies@>
-
- Internally, characters will be represented using the type |ASCII_code|.
- Note, however, that only some of the standard ASCII characters are
- assigned a fixed |ASCII_code|; all other characters are assigned an
- |ASCII_code| dynamically when they are first read from the |translate|
- file specifying the external representation of the `letters' used by a
- particular language. For the sake of generality the standard version of
- this program allows for 256 different |ASCII_code| values, but 128 of
- them would probably suffice for all practical purposes.
-
- @d first_text_char=0 {ordinal number of the smallest element of |text_char|}
- @d last_text_char=255 {ordinal number of the largest element of |text_char|}
- @#
- @d last_ASCII_code=255 {the highest allowed |ASCII_code| value}
-
- @<Types...@>=
- @!text_char=char; {the data type of characters in text files}
- @!ASCII_code=0..last_ASCII_code; {internal representation of input
- characters}
- @!text_file=text;
-
- @ Some \PASCAL s can store only signed eight-bit quantities (|-128..127|)
- but not unsigned ones (|0..255|) in one byte. If storage is tight we
- must, for such \PASCAL s, either restrict |ASCII_code| to the range
- |0..127| (with some loss of generality) or convert between |ASCII_code|
- and |packed_ASCII_code| and vice versa by subtracting or adding an
- offset. (Or we might define |packed_ASCII_code| as |char| and use
- suitable typecasts for the conversion.) Only the type |packed_ASCII_code|
- will be used for large arrays and the \.{WEB} macros |si| and |so| will
- always be used to convert an |ASCII_code| into a |packed_ASCII_code| and
- vice versa.
-
- @d min_packed=0 {change this to `$\\{min\_packed}=-128$' when necessary;
- and don't forget to change the definitions of |si| and |so| below
- accordingly}
- @#
- @d si(#)==# {converts |ASCII_code| to |packed_ASCII_code|}
- @d so(#)==# {converts |packed_ASCII_code| to |ASCII_code|}
-
- @<Types...@>=
- @!packed_ASCII_code=min_packed..last_ASCII_code+min_packed;
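
Editorial note: the Java `si` and `so` methods in this class are identity functions, matching `min_packed=0`. For comparison, here is a sketch of the offset variant the WEB text describes for `min_packed=-128`, using Java's signed `byte` as the packed type. The class is hypothetical and is not proposed for the FOray code.

```java
/** Hypothetical illustration of offset packing; not part of FOray. */
final class PackedCode {

    /** Offset applied when packing, as in the WEB note on min_packed. */
    static final int MIN_PACKED = -128;

    /** Converts an ASCII_code (0..255) to a packed code (-128..127). */
    static byte si(int asciiCode) {
        return (byte) (asciiCode + MIN_PACKED);
    }

    /** Converts a packed code (-128..127) back to an ASCII_code (0..255). */
    static int so(byte packed) {
        return packed - MIN_PACKED;
    }

    public static void main(String[] args) {
        // The consistency check from initialize(), "bad = 2", would pass.
        System.out.println(si(0) == MIN_PACKED && so((byte) MIN_PACKED) == 0);
    }
}
```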
-
- @ We want to make sure that the "constants" defined in this program
- satisfy all the required relations. Some of them are needed to avoid
- time-consuming checks while processing the dictionary and/or to
- prevent range check and array bound violations.
-
- Here we check that the definitions of |ASCII_code| and
- |packed_ASCII_code| are consistent with those of |si| and |so|.
-
- @<Set init...@>=
- bad:=0;@/
- if last_ASCII_code<127 then bad:=1;
- if (si(0)<>min_packed)or(so(min_packed)<>0) then bad:=2;@/
- @<Check the ``constant'' values for consistency@>@;
- if bad>0 then error('Bad constants---case ',bad:1);
- @.Bad constants@>
-
- @ @<Local variables for init...@>=
- @!bad:integer;
- @!i:text_char;
- @!j:ASCII_code;
-
- @ We convert between |ASCII_code| and the user's external character set by
- means of arrays |xord| and |xchr| that are analogous to \PASCAL's |ord|
- and |chr| functions.
-
- @<Globals...@>=
- @!xord: array [text_char] of ASCII_code;
- {specifies conversion of input characters}
- @!xchr: array [ASCII_code] of text_char;
- {specifies conversion of output characters}
-
- @ The following code initializes the |xchr| array with some of the
- standard ASCII characters.
-
- @<Set init...@>=
- for j:=0 to last_ASCII_code do xchr[j]:=' ';
- xchr["."]:='.';@/
- xchr["0"]:='0'; xchr["1"]:='1'; xchr["2"]:='2'; xchr["3"]:='3';
- xchr["4"]:='4'; xchr["5"]:='5'; xchr["6"]:='6'; xchr["7"]:='7';
- xchr["8"]:='8'; xchr["9"]:='9';@/
- xchr["A"]:='A'; xchr["B"]:='B'; xchr["C"]:='C'; xchr["D"]:='D';
- xchr["E"]:='E'; xchr["F"]:='F'; xchr["G"]:='G'; xchr["H"]:='H';
- xchr["I"]:='I'; xchr["J"]:='J'; xchr["K"]:='K'; xchr["L"]:='L';
- xchr["M"]:='M'; xchr["N"]:='N'; xchr["O"]:='O'; xchr["P"]:='P';
- xchr["Q"]:='Q'; xchr["R"]:='R'; xchr["S"]:='S'; xchr["T"]:='T';
- xchr["U"]:='U'; xchr["V"]:='V'; xchr["W"]:='W'; xchr["X"]:='X';
- xchr["Y"]:='Y'; xchr["Z"]:='Z';@/
- xchr["a"]:='a'; xchr["b"]:='b'; xchr["c"]:='c'; xchr["d"]:='d';
- xchr["e"]:='e'; xchr["f"]:='f'; xchr["g"]:='g'; xchr["h"]:='h';
- xchr["i"]:='i'; xchr["j"]:='j'; xchr["k"]:='k'; xchr["l"]:='l';
- xchr["m"]:='m'; xchr["n"]:='n'; xchr["o"]:='o'; xchr["p"]:='p';
- xchr["q"]:='q'; xchr["r"]:='r'; xchr["s"]:='s'; xchr["t"]:='t';
- xchr["u"]:='u'; xchr["v"]:='v'; xchr["w"]:='w'; xchr["x"]:='x';
- xchr["y"]:='y'; xchr["z"]:='z';
-
- @ The following system-independent code makes the |xord| array contain a
- suitable inverse to the information in |xchr|.
-
- @d invalid_code=0 {|ASCII_code| that should not appear}
- @d tab_char=@'11 {|ord| of tab character; tab characters seem to be
- unavoidable with files from UNIX systems}
-
- @<Set init...@>=
- for i:=chr(first_text_char) to chr(last_text_char) do
- xord[i]:=invalid_code;
- for j:=0 to last_ASCII_code do xord[xchr[j]]:=j;
- xord[' ']:=" "; xord[chr(tab_char)]:=" ";
-
- @ So far each invalid |ASCII_code| has been assigned the character |' '|
- and all invalid characters have been assigned |ASCII_code=invalid_code|.
- The |get_ASCII| function, used only while reading the |translate| file,
- returns the |ASCII_code| corresponding to a character, assigning a new
- |ASCII_code| first if necessary.
-
- @d num_ASCII_codes=last_ASCII_code+1 {number of different |ASCII_code|
- values}
-
- @p function get_ASCII(@!c:text_char):ASCII_code;
- label found;
- var i: ASCII_code;
- begin i:=xord[c];
- if i=invalid_code then
- begin while i<last_ASCII_code do
- begin incr(i);
- if (xchr[i]=' ')and(i<>" ") then goto found;
- end;
- overflow(num_ASCII_codes:1,' characters');
- found: xord[c]:=i; xchr[i]:=c;
- end;
- get_ASCII:=i;
- end;
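
Editorial note: `get_ASCII` has not been converted in the revisions shown here. One possible Java rendering is sketched below under stated assumptions: the class name is hypothetical, the tables are built locally instead of using the `PatternGenerator` fields, the `goto found` is replaced by an early return, and the WEB `overflow` macro is replaced by an exception.

```java
/** Hypothetical sketch of get_ASCII; not the FOray implementation. */
final class AsciiCodeTable {

    static final int LAST_ASCII_CODE = 255;
    static final char INVALID_CODE = 0;

    /** Maps external characters to ASCII_codes; zero means unassigned. */
    final char[] xord = new char[LAST_ASCII_CODE + 1];
    /** Maps ASCII_codes to external characters; ' ' means unassigned. */
    final char[] xchr = new char[LAST_ASCII_CODE + 1];

    AsciiCodeTable() {
        for (int j = 0; j <= LAST_ASCII_CODE; j++) {
            xchr[j] = ' ';
        }
        xchr['.'] = '.';
        for (char c = '0'; c <= '9'; c++) { xchr[c] = c; }
        for (char c = 'A'; c <= 'Z'; c++) { xchr[c] = c; }
        for (char c = 'a'; c <= 'z'; c++) { xchr[c] = c; }
        // Java zero-fills xord, so every entry starts as INVALID_CODE.
        for (char j = 0; j <= LAST_ASCII_CODE; j++) {
            xord[xchr[j]] = j;
        }
        xord[' '] = ' ';
        xord['\t'] = ' ';
    }

    /**
     * Returns the ASCII_code for c, assigning the lowest free code first
     * if c has none. The caller must pass c in the range 0..255.
     */
    char getAscii(char c) {
        char i = xord[c];
        if (i != INVALID_CODE) {
            return i;
        }
        while (i < LAST_ASCII_CODE) {
            i++;
            if (xchr[i] == ' ' && i != ' ') {
                xord[c] = i;
                xchr[i] = c;
                return i;
            }
        }
        throw new IllegalStateException(
                "PATGEN capacity exceeded, sorry [256 characters].");
    }
}
```

Characters preset in `xchr` keep their own code, while the first unknown character receives code 1, the next one code 2, and so on.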
-
- @ The \TeX 82 hyphenation algorithm operates on `hyphenable words'
- converted temporarily to lower case, i.e., they may consist of up to
- 255 different `letters' corresponding to \.{\\lccode}s |1..255|. These
- \.{\\lccode}s could, in principle, be language dependent but this might
- lead to undesirable results when hyphenating multilingual paragraphs.
- No more than 245 different letters can occur in hyphenation patterns
- since the characters |'0'..'9'| and |'.'| play a special r\^^Dole when
- reading patterns. For the purpose of this program each letter is
- represented internally by a unique |internal_code>=2| (|internal_code=1|
- is the |edge_of_word| indicator); |internal_code| values |2..127| will
- probably suffice for all practical purposes, but we allow the range
- |2..last_ASCII_code| for the sake of generality. Syntactically
- |internal_code| and |ASCII_code| are the same, we will use one or the
- other name according to the semantic context.
-
- @d edge_of_word=1 {|internal_code| for start and end of a word}
-
- @<Types...@>=
- @!internal_code=ASCII_code;
- @!packed_internal_code=packed_ASCII_code;
-
- @ Note that an |internal_code| used by this program is in general quite
- different from the |ASCII_code| (or rather \.{\\lccode}) used by \TeX
- 82. This program allows the input of characters (from the |dictionary|
- and |patterns| file) corresponding to an |internal_code| in either lower
- or upper case form; the output (to the |patout| and |pattmp| file) will
- always be in lower case form.
-
- Unfortunately there does not (yet?) exist a standardized and widely
- accepted 8-bit character set (or a unique one-to-one translation between
- such sets). On the other hand macro expansion takes place in \TeX 82
- when reading hyphenable words and when reading patterns. Thus the lower
- and upper case versions of all `letters' used by a particular language
- can (and for the sake of portability should) be represented entirely in
- terms of the standard ASCII character set; either directly as characters
- or via macros (or active characters) with or without arguments. The
- macro definitions for such a representation will in general be language
- dependent.
-
- For the purpose of this program the external representation of the lower
- and upper case version of a letter (i.e., |internal_code|) consists of a
- unique sequence of characters (or \\{ASCII\_codes}), the only restriction
- being that no such sequence must be a subsequence of an other one.
- Moreover such sequences must not start with |' '|, |'.'|, |'0'..'9'| or
- with one of the three characters (|'-'|, |'*'|, and |'.'|) representing
- hyphens in the |dictionary| file; a sequence may, however, end with a
- mandatory |' '| as, e.g., the sequence |'\ss '|.
-
- The language dependent values of \.{\\lefthyphenmin} and
- \.{\\righthyphenmin} as well as the external representation of the lower
- and upper case letters and their collating sequence are specified in the
- |translate| file, thus making any language dependent modifications of
- this program unnecessary. If the |translate| file is empty (or does not
- exist) the values \.{\\lefthyphenmin=2} and \.{\\righthyphenmin=3} and
- |internal_code| values |2..27| with the one character external
- representations |'a'..'z'| and |'A'..'Z'| will be used as defaults.
-
- Incidentally this program can be used to convert a |dictionary| and
- |patterns| file from one (``upper case'') to another (``lower case'')
- external representation of letters.
-
- @ When reading the |dictionary| (and |patterns|) file sequences of
- characters must be recognized and converted to their corresponding
- |internal_code|. This conversion is part of \.{PATGEN}s inner loop and
- @^inner loop@>
- must therefore be done as efficient as possible. Thus we will
- mostly bypass the conversion from character to |ASCII_code| and convert
- directly to the corresponding |internal_code| using the |xclass|
- and |xint| arrays. Six types of characters are distinguished by their
- |xclass|:
-
- \yskip\hang |space_class| character |' '| terminates a pattern or word.
-
- \yskip\hang |digit_class| characters |'0'..'9'| are hyphen values for a
- pattern or hyphen weights for a word; their |xint| is the corresponding
- numeric value |0..9|.
-
- \yskip\hang |hyf_class| characters (|'.'|, |'-'|, and |'*'|) are `dots'
- and indicate hyphens in a word; their |xint| is the corresponding
- numeric value |err_hyf..found_hyf|.
-
- \yskip\hang |letter_class| characters represent a letter; their |xint|
- is the corresponding |internal_code|.
-
- \yskip\hang |escape_class| characters indicate the start of a
- multi-character sequence representing a letter.
-
- \yskip\hang |invalid_class| characters should not occur except as part
- of multi-character sequences.
-
- @d space_class=0 {the character |' '|}
- @d digit_class=1 {the characters |'0'..'9'|}
- @d hyf_class=2 {the `hyphen' characters (|'.'|, |'-'|, and |'*'|)}
- @d letter_class=3 {characters representing a letter}
- @d escape_class=4 {characters that start a multi-character sequence
- representing a letter}
- @d invalid_class=5 {characters that normally should not occur}
- @#
- @d no_hyf=0 {no hyphen}
- @d err_hyf=1 {erroneous hyphen}
- @d is_hyf=2 {hyphen}
- @d found_hyf=3 {found hyphen}
-
- @<Types...@>=
- @!class_type=space_class..invalid_class; {class of a character}
- @!digit=0..9; {a hyphen weight (or word weight)}
- @!hyf_type=no_hyf..found_hyf; {type of a hyphen}
-
- @ In addition we will use the |xext|, |xdig|, and |xdot| arrays to
- convert from the internal representation to the corresponding
- characters.
-
- @<Globals...@>=
- @!xclass: array [text_char] of class_type;
- {specifies the class of a character}
- @!xint: array [text_char] of internal_code;
- {specifies the |internal_code| for a character}
- @!xdig: array [0..9] of text_char;
- {specifies conversion of output characters}
- @!xext: array [internal_code] of text_char;
- {specifies conversion of output characters}
- @!xhyf: array [err_hyf..found_hyf] of text_char;
- {specifies conversion of output characters}
-
- @ @<Set init...@>=
- for i:=chr(first_text_char) to chr(last_text_char) do
- begin xclass[i]:=invalid_class; xint[i]:=0;
- end;
- xclass[' ']:=space_class;
- for j:=0 to last_ASCII_code do xext[j]:=' ';
- xext[edge_of_word]:='.';
- for j:=0 to 9 do
- begin xdig[j]:=xchr[j+"0"];
- xclass[xdig[j]]:=digit_class; xint[xdig[j]]:=j;
- end;
- xhyf[err_hyf]:='.'; xhyf[is_hyf]:='-'; xhyf[found_hyf]:='*';
- {default representation for hyphens}
-
- @ We assume that words use only the letters |cmin+1| through |cmax|.
- This allows us to save some time on trie operations that involve
- searching for packed transitions belonging to a particular state.
-
- @d cmin=edge_of_word
-
- @<Globals...@>=
- @!cmax: internal_code; {largest |internal_code| or |ASCII_code|}
-
- @* Data structures.
- The main data structure used in this program is a dynamic packed trie.
- In fact we use two of them, one for the set of patterns selected so far,
- and one for the patterns being considered in the current pass.
-
- For a pattern $p_1\ldots p_k$, the information associated with that
- pattern is accessed by setting |@t$t_1$@>:=trie_root+@t$p_1$@>| and
- then, for |1<i<=k|, setting |@t$t_i$@>:=trie_link(@t$t_{i-1}$@>)+
- @t$p_i$@>|; the pattern information is then stored in a location addressed
- by |@t$t_k$@>|. Since all trie nodes are packed into a single array, in
- order to distinguish nodes belonging to different trie families, a special
- field is provided such that |trie_char@t$(t_i)=si(p_i)$@>| for all |i|.
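
Editorial note: the access rule above can be sketched in Java as follows. This is an illustration only: the class name, the array parameters, and the -1 "not found" result are assumptions, and `si` is taken to be the identity as in the current Java code. A link of 0 needs no special case because the root is packed at base 1, so the character check fails on its own.

```java
/** Hypothetical sketch of packed-trie lookup; not the FOray implementation. */
final class PackedTrieLookup {

    static final int TRIE_ROOT = 1;

    /**
     * Follows pattern p_1..p_k through the packed trie.
     * @param trieChar Character field of each node (trie_char).
     * @param trieLink Link field of each node (trie_link).
     * @param pattern Internal codes p_1..p_k; must not be empty.
     * @return The node index t_k, or -1 if the pattern is not present.
     */
    static int find(char[] trieChar, int[] trieLink, int[] pattern) {
        int t = TRIE_ROOT + pattern[0];
        if (t >= trieChar.length || trieChar[t] != pattern[0]) {
            return -1;
        }
        for (int i = 1; i < pattern.length; i++) {
            t = trieLink[t] + pattern[i];
            // A different trie_char means the cell belongs to another family.
            if (t >= trieChar.length || trieChar[t] != pattern[i]) {
                return -1;
            }
        }
        return t;
    }
}
```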
-
- In addition the trie must support dynamic insertions and deletions. This
- is done by maintaining a doubly linked list of unoccupied cells and
- repacking trie families as necessary when insertions are made.
-
- Each trie node consists of three fields: the character |trie_char|, and
- the two link fields |trie_link| and |trie_back|. In addition there is a
- separate boolean array |trie_base_used|. When a node is unoccupied,
- |trie_char=min_packed| and the link fields point to the next and previous
- unoccupied nodes, respectively, in the doubly linked list. When a node is
- occupied, |trie_link| points to the next trie family, and |trie_back|
- (renamed |trie_outp|) contains the output associated with this transition.
- The |trie_base_used| bit indicates that some family has been packed at
- this base location, and is used to prevent two families from being packed
- at the same location.
-
- @ The sizes of the pattern tries may have to be adjusted depending
- on the particular application (i.e., the parameter settings and the
- size of the dictionary). The sizes below were sufficient to generate
- the original set of english \TeX 82 hyphenation patterns (file
- \.{hyphen.tex}).
-
- @<Constants...@>=
- @!trie_size=55000; {space for pattern trie}
- @!triec_size=26000; {space for pattern count trie, must be less than
- |trie_size| and greater than the number of occurrences of any pattern in
- the dictionary}
- @!max_ops=4080; {size of output hash table, should be a multiple of 510}
- @!max_val=10; {maximum number of levels$+1$, also used to denote bad
- patterns}
- @!max_dot=15; {maximum pattern length, also maximum length of external
- representation of a `letter'}
- @!max_len=50; {maximum word length}
- @!max_buf_len=80; {maximum length of input lines, must be at least
- |max_len|}
-
- @ @<Check the ``constant'' values for consistency@>=
- if (triec_size<4096)or(trie_size<triec_size) then bad:=3;
- if max_ops>trie_size then bad:=4;
- if max_val>10 then bad:=5;
- if max_buf_len<max_len then bad:=6;
-
- @ @<Types...@>=
- @!q_index=1..last_ASCII_code; {number of transitions in a state}
- @!val_type=0..max_val; {hyphenation values}
- @!dot_type=0..max_dot; {dot positions}
- @!op_type=0..max_ops; {index into output hash table}
- @!word_index=0..max_len; {index into |word|}
- @!trie_pointer=0..trie_size;
- @!triec_pointer=0..triec_size;@/
- @!op_word=packed record dot: dot_type; val: val_type; op: op_type end;
-
- @ Trie is actually stored with its components in separate packed arrays,
- in order to save space and time (although this depends on the computer's
- word size and the size of the trie pointers).
-
- @<Globals...@>=
- @!trie_c: packed array[trie_pointer] of packed_internal_code;
- @!trie_l, @!trie_r: packed array[trie_pointer] of trie_pointer;
- @!trie_taken: packed array[trie_pointer] of boolean;
- @!triec_c: packed array[triec_pointer] of packed_internal_code;
- @!triec_l, @!triec_r: packed array[triec_pointer] of triec_pointer;
- @!triec_taken: packed array[triec_pointer] of boolean;
- @!ops: array[op_type] of op_word; {output hash table}
-
- @ When some trie state is being worked on, an unpacked version of the
- state is kept in positions |1..qmax| of the global arrays |trieq_c|,
- |trieq_l|, and |trieq_r|. The character fields need not be in any
- particular order.
-
- @<Globals...@>=
- @!trieq_c: array[q_index] of internal_code; {character fields of a
- single trie state}
- @!trieq_l, @!trieq_r: array[q_index] of trie_pointer; {link fields}
- @!qmax: q_index; {number of transitions in an unpacked state}
- @!qmax_thresh: q_index; {controls density of first-fit packing}
-
- @ Trie fields are accessed using the following macros.
-
- @d trie_char(#)==trie_c[#]
- @d trie_link(#)==trie_l[#]
- @d trie_back(#)==trie_r[#]
- @d trie_outp(#)==trie_r[#]
- @d trie_base_used(#)==trie_taken[#]
- @#
- @d triec_char(#)==triec_c[#]
- @d triec_link(#)==triec_l[#]
- @d triec_back(#)==triec_r[#]
- @d triec_good(#)==triec_l[#]
- @d triec_bad(#)==triec_r[#]
- @d triec_base_used(#)==triec_taken[#]
- @#
- @d q_char(#)==trieq_c[#]
- @d q_link(#)==trieq_l[#]
- @d q_back(#)==trieq_r[#]
- @d q_outp(#)==trieq_r[#]
- @#
- @d hyf_val(#)==ops[#].val
- @d hyf_dot(#)==ops[#].dot
- @d hyf_nxt(#)==ops[#].op
-
- @* Routines for pattern trie.
- The pattern trie holds the set of patterns chosen prior to the current
- pass, including bad or ``hopeless'' patterns at the current level that
- occur too few times in the dictionary to be of use. Each transition of
- the trie includes an output field pointing to the hyphenation information
- associated with this transition.
-
- @<Globals...@>=
- @!trie_max: trie_pointer; {maximum occupied trie node}
- @!trie_bmax: trie_pointer; {maximum base of trie family}
- @!trie_count: trie_pointer; {number of occupied trie nodes, for space usage
- statistics}
- @!op_count: op_type; {number of outputs in hash table}
-
- @ Initially, the dynamic packed trie has just one state, namely the root,
- with all transitions present (but with null links). This is convenient
- because the root will never need to be repacked and also we won't have to
- check that the base is nonnegative when packing other states.
- Moreover in many cases we need not check for a vanishing link field:
- if |trie_link(t)=0| then a subsequent test for
- |trie_char(trie_link(t)+c)=si(c)| will always fail due to |trie_root=1|.
-
- @d trie_root=1
-
- @p procedure init_pattern_trie;
- var c: internal_code; @!h: op_type;
- begin for c:=0 to last_ASCII_code do
- begin trie_char(trie_root+c):=si(c); {indicates node occupied;
- fake for |c=0|}
- trie_link(trie_root+c):=0;
- trie_outp(trie_root+c):=0;
- trie_base_used(trie_root+c):=false;
- end;
- trie_base_used(trie_root):=true;
- trie_bmax:=trie_root;
- trie_max:=trie_root+last_ASCII_code;
- trie_count:=num_ASCII_codes;@/
- qmax_thresh:=5;@/
- trie_link(0):=trie_max+1;
- trie_back(trie_max+1):=0;@/
- {|trie_link(0)| is used as the head of the doubly linked list of
- unoccupied cells}
- for h:=1 to max_ops do hyf_val(h):=0; {clear output hash table}
- op_count:=0;
- end;
-
- @ The |first_fit| procedure finds a hole in the packed trie into which the
- state in |trieq_c|, |trieq_l|, and |trieq_r| will fit. This is normally
- done by going through the linked list of unoccupied cells and testing if
- the state will fit at each position. However if a state has too many
- transitions (and is therefore unlikely to fit among existing
- transitions) we don't bother and instead just pack it immediately to the
- right of the occupied region (starting at |trie_max+1|).
-
- @p function first_fit: trie_pointer;
- label found, not_found;
- var s, @!t: trie_pointer; @!q: q_index;
- begin @<Set |s| to the trie base location at which this state should be
- packed@>;
- for q:=1 to qmax do {pack it}
- begin t:=s+q_char(q);@/
- trie_link(trie_back(t)):=trie_link(t);
- trie_back(trie_link(t)):=trie_back(t); {link around
- filled cell}
- trie_char(t):=si(q_char(q));
- trie_link(t):=q_link(q);
- trie_outp(t):=q_outp(q);
- if t>trie_max then trie_max:=t;
- end;
- trie_base_used(s):=true;
- first_fit:=s
- end;
-
- @ The threshold for large states is initially 5 transitions. If more than
- one level of patterns is being generated, the threshold is set to 7 on
- subsequent levels because the pattern trie will be sparser after bad
- patterns are deleted (see |delete_bad_patterns|).
-
- @<Set |s| to the trie base location at which this state should be packed@>=
- if qmax>qmax_thresh then t:=trie_back(trie_max+1) @+else t:=0;
- loop begin t:=trie_link(t); s:=t-q_char(1); {get next unoccupied cell}
- @<Ensure |trie| linked up to |s+num_ASCII_codes|@>;
- if trie_base_used(s) then goto not_found;
- for q:=qmax downto 2 do {check if state fits here}
- if trie_char(s+q_char(q))<>min_packed then goto not_found;
- goto found;
- not_found: end;
- found:
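
The first-fit idea described above can be sketched in Java roughly as follows. This is a simplified illustration, not the converted routine: it scans candidate bases linearly instead of walking the linked list of unoccupied cells, it omits the `qmax_thresh` shortcut for large states, and the names `FirstFitSketch`, `pack`, and `fits` are hypothetical.

```java
/** Simplified first-fit packing of trie states into one shared array. */
final class FirstFitSketch {
    private final boolean[] occupied; // cell s+c holds a transition
    private final boolean[] baseUsed; // base s already belongs to a state

    FirstFitSketch(int size) {
        occupied = new boolean[size];
        baseUsed = new boolean[size];
    }

    /**
     * Packs a state whose transitions are the given distinct, nonnegative
     * character codes, and returns the base that was chosen.
     */
    int pack(int[] chars) {
        int maxChar = 0;
        for (int c : chars) {
            maxChar = Math.max(maxChar, c);
        }
        for (int s = 0; s + maxChar < occupied.length; s++) {
            if (baseUsed[s] || !fits(s, chars)) {
                continue; // try the next candidate base
            }
            for (int c : chars) {
                occupied[s + c] = true;
            }
            baseUsed[s] = true;
            return s;
        }
        throw new IllegalStateException("no room left in the packed array");
    }

    /** True if every cell s+c needed by the state is still free. */
    private boolean fits(int s, int[] chars) {
        for (int c : chars) {
            if (occupied[s + c]) {
                return false;
            }
        }
        return true;
    }
}
```

Two states may interleave their cells as long as no cell is claimed twice and no two states share a base, which is what lets the trie stay compact.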
-
- @ The trie is only initialized (as a doubly linked list of empty cells) as
- far as necessary. Here we extend the initialization if necessary, and
- check for overflow.
-
- @<Ensure |trie| linked up to |s+num_ASCII_codes|@>=
- if s>trie_size-num_ASCII_codes then
- overflow(trie_size:1,' pattern trie nodes');
- while trie_bmax<s do
- begin incr(trie_bmax);
- trie_base_used(trie_bmax):=false;
- trie_char(trie_bmax+last_ASCII_code):=min_packed;
- trie_link(trie_bmax+last_ASCII_code):=trie_bmax+num_ASCII_codes;
- trie_back(trie_bmax+num_ASCII_codes):=trie_bmax+last_ASCII_code;
- end
-
- @ The |unpack| procedure finds all transitions associated with the state
- with base |s|, puts them into the arrays |trieq_c|, |trieq_l|, and
- |trieq_r|, and sets |qmax| to one more than the number of transitions
- found. Freed cells are put at the beginning of the free list.
-
- @p procedure unpack(@!s: trie_pointer);
- var c: internal_code; @!t: trie_pointer;
- begin qmax:=1;
- for c:=cmin to cmax do {search for transitions belonging to this state}
- begin t:=s+c;
- if so(trie_char(t))=c then {found one}
- begin q_char(qmax):=c;
- q_link(qmax):=trie_link(t);
- q_outp(qmax):=trie_outp(t);
- incr(qmax);@/
- {now free trie node}
- trie_back(trie_link(0)):=t;
- trie_link(t):=trie_link(0);
- trie_link(0):=t;
- trie_back(t):=0;
- trie_char(t):=min_packed;
- end;
- end;
- trie_base_used(s):=false;
- end;
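
The free-list manipulation used by |first_fit| and |unpack| may be easier to follow in isolation. The sketch below assumes a circular list with cell 0 as its head, which is a simplification: the original extends the list lazily up to |trie_max+1|. The class and method names are hypothetical.

```java
/** Doubly linked free list threaded through two parallel int arrays. */
final class FreeListSketch {
    private final int[] link; // forward pointer, like trie_link
    private final int[] back; // backward pointer, like trie_back

    /** Cells 1..size start out free, in increasing order; cell 0 is the head. */
    FreeListSketch(int size) {
        link = new int[size + 1];
        back = new int[size + 1];
        for (int t = 0; t <= size; t++) {
            link[t] = (t == size) ? 0 : t + 1;
            back[t] = (t == 0) ? size : t - 1;
        }
    }

    /** Takes cell t out of the free list ("link around filled cell"). */
    void claim(int t) {
        link[back[t]] = link[t];
        back[link[t]] = back[t];
    }

    /** Puts cell t back at the beginning of the free list, as unpack does. */
    void release(int t) {
        back[link[0]] = t;
        link[t] = link[0];
        link[0] = t;
        back[t] = 0;
    }

    /** Returns the first free cell, or 0 if the list is empty. */
    int firstFree() {
        return link[0];
    }
}
```

Because both pointers are kept, a cell can be removed from the middle of the list in constant time without searching for its predecessor.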
-
- @ The function |new_trie_op| returns the `opcode' for the output
- consisting of hyphenation value~|v|, hyphen position |d|, and next output
- |n|. The hash function used by |new_trie_op| is based on the idea that
- 313/510 is an approximation to the golden ratio [cf.\ {\sl The Art of
- Computer Programming \bf3} (1973), 510--512]; but the choice is
- comparatively unimportant in this particular application.
-
- @p function new_trie_op(@!v: val_type; @!d: dot_type; @!n: op_type):
- op_type;
- label exit;
- var h: op_type;
- begin h:=((n+313*d+361*v) mod max_ops)+1; {trial hash location}
- loop begin if hyf_val(h)=0 then {empty position found}
- begin incr(op_count);
- if op_count=max_ops then overflow(max_ops:1,' outputs');
- hyf_val(h):=v; hyf_dot(h):=d; hyf_nxt(h):=n; new_trie_op:=h; return;
- end;
- if (hyf_val(h)=v) and (hyf_dot(h)=d) and
- (hyf_nxt(h)=n) then {already in hash table}
- begin new_trie_op:=h; return;
- end;
- if h>1 then decr(h) @+else h:=max_ops; {try again}
- end;
- exit: end;
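
As a rough Java rendering of the hashing scheme in |new_trie_op|: the sketch below keeps the same hash expression and the same downward probing with wrap-around, but the class name `OutputTableSketch`, the method name `intern`, and the exception types are assumptions made for illustration.

```java
/** Hash table of (val, dot, next) triples with downward linear probing. */
final class OutputTableSketch {
    private final int maxOps;
    private final int[] val;  // 0 marks an empty slot
    private final int[] dot;
    private final int[] next;
    private int count;

    OutputTableSketch(int maxOps) {
        this.maxOps = maxOps;
        val = new int[maxOps + 1]; // slots 1..maxOps; index 0 is not hashed to
        dot = new int[maxOps + 1];
        next = new int[maxOps + 1];
    }

    /** Returns the slot holding (v, d, n), inserting the triple if needed. */
    int intern(int v, int d, int n) {
        if (v <= 0) {
            throw new IllegalArgumentException("val must be positive; 0 means empty");
        }
        int h = ((n + 313 * d + 361 * v) % maxOps) + 1; // trial hash location
        while (true) {
            if (val[h] == 0) { // empty position found
                count++;
                if (count == maxOps) {
                    throw new IllegalStateException("too many outputs");
                }
                val[h] = v;
                dot[h] = d;
                next[h] = n;
                return h;
            }
            if (val[h] == v && dot[h] == d && next[h] == n) {
                return h; // already in the table
            }
            h = (h > 1) ? h - 1 : maxOps; // try the slot below, wrapping around
        }
    }
}
```

The overflow check fires before the table is completely full, so at least one slot always stays empty and the probe loop is guaranteed to terminate.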
-
- @ @<Globals...@>=
- @!pat: array[dot_type] of internal_code; {current pattern}
- @!pat_len: dot_type; {pattern length}
-
- @ Now that we have provided the necessary routines for manipulating the
- dynamic packed trie, here is a procedure that inserts a pattern of length
- |pat_len|, stored in the |pat| array, into the pattern trie. It also adds
- a new output.
-
- @p procedure insert_pattern(@!val: val_type; @!dot: dot_type);
- var i: dot_type; @!s, @!t: trie_pointer;
- begin i:=1;
- s:=trie_root+pat[i]; t:=trie_link(s);
- while (t>0) and (i<pat_len) do {follow existing trie}
- begin incr(i); Incr(t)(pat[i]);
- if so(trie_char(t))<>pat[i] then
- @<Insert critical transition, possibly repacking@>;
- s:=t; t:=trie_link(s);
- end;
- q_link(1):=0; q_outp(1):=0; qmax:=1;
- while i<pat_len do {insert rest of pattern}
- begin incr(i); q_char(1):=pat[i];
- t:=first_fit;
- trie_link(s):=t;
- s:=t+pat[i];
- incr(trie_count);
- end;
- trie_outp(s):=new_trie_op(val,dot,trie_outp(s));
- end;
-
- @ We have accessed a transition not in the trie. We insert it, repacking
- the state if necessary.
-
- @<Insert critical transition, possibly repacking@>=
- begin if trie_char(t)=min_packed then
- begin {we're lucky, no repacking needed}
- trie_link(trie_back(t)):=trie_link(t);
- trie_back(trie_link(t)):=trie_back(t);@/
- trie_char(t):=si(pat[i]);
- trie_link(t):=0;
- trie_outp(t):=0;
- if t>trie_max then trie_max:=t;
- end
- else begin {whoops, have to repack}
- unpack(t-pat[i]);@/
- q_char(qmax):=pat[i];
- q_link(qmax):=0;
- q_outp(qmax):=0;@/
- t:=first_fit;
- trie_link(s):=t;
- Incr(t)(pat[i]);
- end;
- incr(trie_count);
- end
-
- @* Routines for pattern count trie.
- The pattern count trie is used to store the set of patterns considered in
- the current pass, along with the counts of good and bad instances. The
- fields of this trie are the same as the pattern trie, except that there is
- no output field, and leaf nodes are also used to store counts
- (|triec_good| and |triec_bad|). Except where noted, the following
- routines are analogous to the pattern trie routines.
-
- @<Globals...@>=
- @!triec_max, @!triec_bmax, @!triec_count: triec_pointer; {same as for
- pattern trie}
- @!triec_kmax: triec_pointer; {shows growth of trie during pass}
- @!pat_count: integer; {number of patterns in count trie}
-
- @ [See |init_pattern_trie|.] The variable |triec_kmax| always contains
- the size of the count trie rounded up to the next multiple of 4096, and is
- used to show the growth of the trie during each pass.
-
- @d triec_root=1
-
- @p procedure init_count_trie;
- var c: internal_code;
- begin for c:=0 to last_ASCII_code do
- begin triec_char(triec_root+c):=si(c);@/
- triec_link(triec_root+c):=0;
- triec_back(triec_root+c):=0;
- triec_base_used(triec_root+c):=false;
- end;
- triec_base_used(triec_root):=true;
- triec_bmax:=triec_root; triec_max:=triec_root+last_ASCII_code;
- triec_count:=num_ASCII_codes; triec_kmax:=4096;@/
- triec_link(0):=triec_max+1; triec_back(triec_max+1):=0;@/
- pat_count:=0;
- end;
-
- @ [See |first_fit|.]
-
- @p function firstc_fit: triec_pointer;
- label found, not_found;
- var a, @!b: triec_pointer; @!q: q_index;
- begin @<Set |b| to the count trie base location at which this state should
- be packed@>;
- for q:=1 to qmax do {pack it}
- begin a:=b+q_char(q);@/
- triec_link(triec_back(a)):=triec_link(a);
- triec_back(triec_link(a)):=triec_back(a);@/
- triec_char(a):=si(q_char(q));
- triec_link(a):=q_link(q);
- triec_back(a):=q_back(q);
- if a>triec_max then triec_max:=a;
- end;
- triec_base_used(b):=true;
- firstc_fit:=b
- end;
-
- @ The threshold for attempting a first-fit packing is 3 transitions, which
- is lower than for the pattern trie because speed is more important here.
-
- @<Set |b| to the count trie base location...@>=
- if qmax>3 then a:=triec_back(triec_max+1) @+else a:=0;
- loop begin a:=triec_link(a); b:=a-q_char(1);@/
- @<Ensure |triec| linked up to |b+num_ASCII_codes|@>;
- if triec_base_used(b) then goto not_found;
- for q:=qmax downto 2 do
- if triec_char(b+q_char(q))<>min_packed then goto not_found;
- goto found;
- not_found: end;
- found:
-
- @ @<Ensure |triec| linked up to |b+num_ASCII_codes|@>=
- if b>triec_kmax-num_ASCII_codes then
- begin if triec_kmax=triec_size then
- overflow(triec_size:1,' count trie nodes');
- print(triec_kmax div 1024:1, 'K ');
- if triec_kmax>triec_size-4096 then triec_kmax:=triec_size
- else Incr(triec_kmax)(4096);
- end;
- while triec_bmax<b do
- begin incr(triec_bmax);
- triec_base_used(triec_bmax):=false;
- triec_char(triec_bmax+last_ASCII_code):=min_packed;
- triec_link(triec_bmax+last_ASCII_code):=triec_bmax+num_ASCII_codes;
- triec_back(triec_bmax+num_ASCII_codes):=triec_bmax+last_ASCII_code;
- end
-
- @ [See |unpack|.]
-
- @p procedure unpackc(@!b: triec_pointer);
- var c: internal_code; @!a: triec_pointer;
- begin qmax:=1;
- for c:=cmin to cmax do {search for transitions belonging to this state}
- begin a:=b+c;
- if so(triec_char(a))=c then {found one}
- begin q_char(qmax):=c;
- q_link(qmax):=triec_link(a);
- q_back(qmax):=triec_back(a);
- incr(qmax);@/
- triec_back(triec_link(0)):=a;
- triec_link(a):=triec_link(0);
- triec_link(0):=a; triec_back(a):=0;
- triec_char(a):=min_packed;
- end;
- end;
- triec_base_used(b):=false;
- end;
-
- @ [See |insert_pattern|.] Patterns being inserted into the count trie are
- always substrings of the current word, so they are contained in the array
- |word| with length |pat_len| and finishing position |fpos|.
-
- @p function insertc_pat(@!fpos: word_index): triec_pointer;
- var spos: word_index; @!a, @!b: triec_pointer;
- begin spos:=fpos-pat_len; {starting position of pattern}
- incr(spos); b:=triec_root+word[spos]; a:=triec_link(b);
- while (a>0) and (spos<fpos) do {follow existing trie}
- begin incr(spos); Incr(a)(word[spos]);
- if so(triec_char(a))<>word[spos] then
- @<Insert critical count transition, possibly repacking@>;
- b:=a; a:=triec_link(a);
- end;
- q_link(1):=0; q_back(1):=0; qmax:=1;
- while spos<fpos do {insert rest of pattern}
- begin incr(spos); q_char(1):=word[spos];
- a:=firstc_fit;
- triec_link(b):=a;
- b:=a+word[spos];
- incr(triec_count);
- end;
- insertc_pat:=b;
- incr(pat_count);
- end;
-
- @ @<Insert critical count transition, possibly repacking@>=
- begin if triec_char(a)=min_packed then {lucky}
- begin triec_link(triec_back(a)):=triec_link(a);
- triec_back(triec_link(a)):=triec_back(a);
- triec_char(a):=si(word[spos]);
- triec_link(a):=0;
- triec_back(a):=0;
- if a>triec_max then triec_max:=a;
- end
- else begin {have to repack}
- unpackc(a-word[spos]);@/
- q_char(qmax):=word[spos];
- q_link(qmax):=0;
- q_back(qmax):=0;
- a:=firstc_fit;
- triec_link(b):=a;
- Incr(a)(word[spos]);
- end;
- incr(triec_count);
- end
-
- @* Input and output.
- For some \PASCAL\ systems output files must be closed before the program
- terminates; it may also be necessary to close input files. Since
- standard \PASCAL\ does not provide for this, we use \.{WEB} macros and
- will say |close_out(f)| resp.\ |close_in(f)|; these macros should not
- produce errors or system messages, even if a file could not be opened
- successfully.
-
- @d close_out(#)==close(#) {close an output file}
- @d close_in(#)==do_nothing {close an input file}
-
- @<Globals...@>=
- @!dictionary, @!patterns, @!translate, @!patout, @!pattmp: text_file;
-
- @ When reading a line from one of the input files (|dictionary|,
- |patterns|, or |translate|) the characters read from that line (padded
- with blanks if necessary) are to be placed into the |buf| array. Reading
- lines from the |dictionary| file should be as efficient as possible
- since this is part of \.{PATGEN}'s ``inner loop''. Standard \PASCAL,
- unfortunately, does not provide for this; consequently the \.{WEB} macro
- |read_buf| defined below should be optimized if possible. For many
- \PASCAL's this can be done with |read_ln(f,buf)| where |buf| is declared
- as \PASCAL\ string (i.e., as \&{packed} \&{array} |[1..any]| \&{of}
- |char|), for others a string type with dynamic length can be used.
- @^inner loop@>
-
- @d read_buf(#)== {reads a line from input file |#| into |buf| array}
- begin buf_ptr:=0;
- while not eoln(#) do
- begin if (buf_ptr>=max_buf_len) then bad_input('Line too long');
- @.Line too long@>
- incr(buf_ptr); read(#,buf[buf_ptr]);
- end;
- read_ln(#);
- while buf_ptr<max_buf_len do
- begin incr(buf_ptr); buf[buf_ptr]:=' ';
- end;
- end
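
In a Java port, the behaviour of |read_buf| could plausibly be approximated with `BufferedReader.readLine`. The sketch below is an assumption about how that might look, not the converted code: `ReadBufSketch` and `readPadded` are hypothetical names, index 0 of the returned array is left unused to mirror the 1-based |buf|, and an unchecked exception stands in for |bad_input|.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Reads one line into a fixed-size, blank-padded, 1-based buffer. */
final class ReadBufSketch {
    /** Returns the padded buffer, or null at end of input. */
    static char[] readPadded(BufferedReader in, int maxBufLen) {
        String line;
        try {
            line = in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (line == null) {
            return null;
        }
        if (line.length() > maxBufLen) {
            throw new IllegalArgumentException("Line too long");
        }
        char[] buf = new char[maxBufLen + 1];
        Arrays.fill(buf, ' ');                   // pad with blanks
        line.getChars(0, line.length(), buf, 1); // characters go to buf[1..]
        return buf;
    }
}
```

A line of exactly `maxBufLen` characters is accepted, matching the original test, which only fails when a further character is about to be read.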
-
- @<Globals...@>=
- @!buf: array[1..max_buf_len] of text_char; {array to hold lines of input}
- @!buf_ptr: 0..max_buf_len; {index into |buf|}
-
- @ When an error is caused by bad input data we say |bad_input(#)| in
- order to display the contents of the |buf| array before terminating with
- an error message.
-
- @d print_buf== {print contents of |buf| array}
- begin buf_ptr:=0;
- repeat incr(buf_ptr); print(buf[buf_ptr]);
- until buf_ptr=max_buf_len;
- print_ln(' ');
- end
- @d bad_input(#)==begin print_buf; error(#); end
-
- @ The |translate| file may specify the values of \.{\\lefthyphenmin} and
- \.{\\righthyphenmin} as well as the external representation and
- collating sequence of the `letters' used by the language. In addition
- replacements may be specified for the characters |'-'|, |'*'|, and |'.'|
- representing hyphens in the word list. If the |translate| file is empty
- (or does not exist) default values will be used.
-
- @p procedure read_translate;
- label done;
- var c: text_char;
- @!n: integer;
- @!j: ASCII_code;
- @!bad: boolean;
- @!lower: boolean;
- @!i: dot_type; @!s, @!t: trie_pointer;
- begin imax:=edge_of_word;
- reset(translate);
- if eof(translate) then
- @<Set up default character translation tables@>
- else begin read_buf(translate); @<Set up hyphenation data@>;
- cmax:=last_ASCII_code-1;
- while not eof(translate) do @<Set up representation(s) for a letter@>;
- end;
- close_in(translate);
- print_ln('left_hyphen_min = ',left_hyphen_min:1,
- ', right_hyphen_min = ',right_hyphen_min:1,
- ', ',imax-edge_of_word:1,' letters');
- cmax:=imax;
- end;
-
- @ @<Globals...@>=
- @!imax: internal_code; {largest |internal_code| assigned so far}
- @!left_hyphen_min, @!right_hyphen_min: dot_type;
-
- @ @<Set up default...@>=
- begin left_hyphen_min:=2; right_hyphen_min:=3;
- for j:="A" to "Z" do
- begin incr(imax);
- c:=xchr[j+"a"-"A"]; xclass[c]:=letter_class; xint[c]:=imax;
- xext[imax]:=c;
- c:=xchr[j]; xclass[c]:=letter_class; xint[c]:=imax;
- end;
- end
-
- @ The first line of the |translate| file must contain the values
- of \.{\\lefthyphenmin} and \.{\\righthyphenmin} in columns 1--2 and
- 3--4. In addition columns~5, 6, and~7 may (optionally) contain
- replacements for the default characters |'.'|, |'-'|, and |'*'|
- respectively, representing hyphens in the word list.
- If the values specified for \.{\\lefthyphenmin} and \.{\\righthyphenmin}
- are invalid (e.g., blank) new values are read from the terminal.
-
- @<Set up hyphenation...@>=
- bad:=false;
- if buf[1]=' ' then n:=0
- else if xclass[buf[1]]=digit_class then n:=xint[buf[1]]@+
- else bad:=true;
- if xclass[buf[2]]=digit_class then n:=10*n+xint[buf[2]]@+
- else bad:=true;
- if (n>=1)and(n<max_dot) then left_hyphen_min:=n@+else bad:=true;
- if buf[3]=' ' then n:=0
- else if xclass[buf[3]]=digit_class then n:=xint[buf[3]]@+
- else bad:=true;
- if xclass[buf[4]]=digit_class then n:=10*n+xint[buf[4]]@+
- else bad:=true;
- if (n>=1)and(n<max_dot) then right_hyphen_min:=n@+
- else bad:=true;
- if bad then
- begin bad:=false;
- repeat print('left_hyphen_min, right_hyphen_min: '); get_input(n1,n2);@/
- if (n1>=1)and(n1<max_dot)and(n2>=1)and(n2<max_dot) then
- begin left_hyphen_min:=n1; right_hyphen_min:=n2;
- end
- else begin n1:=0;
- print_ln('Specify 1<=left_hyphen_min,right_hyphen_min<=',
- max_dot-1:1,' !');
- end;
- until n1>0;
- end;
- for j:=err_hyf to found_hyf do
- begin if buf[j+4]<>' ' then xhyf[j]:=buf[j+4];
- if xclass[xhyf[j]]=invalid_class then xclass[xhyf[j]]:=hyf_class@+
- else bad:=true;
- end;
- xclass['.']:=hyf_class; {in case the default has been changed}
- if bad then bad_input('Bad hyphenation data')
- @.Bad hyphenation data@>
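
The two-column number format used for \.{\\lefthyphenmin} and \.{\\righthyphenmin} can be illustrated with a small Java sketch. It assumes plain ASCII digits rather than the |xclass| and |xint| translation tables, and the names `HyphenMinSketch`, `parseTwoColumns`, and `isValid` are hypothetical.

```java
/** Parses the right-aligned two-column numbers on the first translate line. */
final class HyphenMinSketch {
    /**
     * Returns the value in the two columns, or -1 if they are invalid.
     * The first column may be blank; the second must be a digit.
     */
    static int parseTwoColumns(char first, char second) {
        int n;
        if (first == ' ') {
            n = 0;
        } else if (isAsciiDigit(first)) {
            n = first - '0';
        } else {
            return -1;
        }
        if (!isAsciiDigit(second)) {
            return -1;
        }
        return 10 * n + (second - '0');
    }

    /** Mirrors the range check (n>=1) and (n<max_dot). */
    static boolean isValid(int n, int maxDot) {
        return n >= 1 && n < maxDot;
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

With `maxDot` set to 15, a line beginning ` 2 3` would yield the valid pair 2 and 3, while a blank second column is rejected and would send the original program to the terminal prompt.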
-
- @ Each following line is either a comment or specifies the external
- representations for one `letter' used by the language. Comment lines
- start with two equal characters (e.g., are blank) and are ignored.
- Other lines contain the external representation of the lower case
- version and an arbitrary number of `upper case versions' of a letter
- preceded and separated by a delimiter and followed by two consecutive
- delimiters; the delimiter may be any character not occurring in either
- version.
-
- @<Set up repres...@>=
- begin read_buf(translate); buf_ptr:=1; lower:=true;
- while not bad do {lower and then upper case version}
- begin pat_len:=0;
- repeat if buf_ptr<max_buf_len then incr(buf_ptr) @+ else bad:=true;
- if buf[buf_ptr]=buf[1] then
- if pat_len=0 then goto done
- else begin if lower then
- begin if imax=last_ASCII_code then
- begin print_buf; overflow(num_ASCII_codes:1,' letters');
- end;
- incr(imax); xext[imax]:=xchr[pat[pat_len]];
- end;
- c:=xchr[pat[1]];
- if pat_len=1 then
- begin if xclass[c]<>invalid_class then bad:=true;
- xclass[c]:=letter_class; xint[c]:=imax;
- end
- else @<Insert a letter into pattern trie@>;
- end
- else if pat_len=max_dot then bad:=true
- else begin incr(pat_len); pat[pat_len]:=get_ASCII(buf[buf_ptr]);
- end;
- until (buf[buf_ptr]=buf[1])or bad;
- lower:=false;
- end;
- done: if bad then bad_input('Bad representation');
- @.Bad representation@>
- end
-
- @ When the (lower or upper case) external representation of a letter
- consists of more than one character and the corresponding |ASCII_code|
- values have been placed into the |pat| array we store them in
- the pattern trie. [See |insert_pattern|.] Since this `external subtrie'
- starts at |trie_link(trie_root)| it does not interfere with normal
- patterns. The output field of leaf nodes contains the |internal_code|
- and the link field distinguishes between lower and upper case letters.
-
- @<Insert a letter...@>=
- begin if xclass[c]=invalid_class then xclass[c]:=escape_class;
- if xclass[c]<>escape_class then bad:=true;
- i:=0; s:=trie_root; t:=trie_link(s);
- while (t>trie_root) and (i<pat_len) do {follow existing trie}
- begin incr(i); Incr(t)(pat[i]);
- if so(trie_char(t))<>pat[i] then
- @<Insert critical transition, possibly repacking@>
- else if trie_outp(t)>0 then bad:=true;
- s:=t; t:=trie_link(s);
- end;
- if t>trie_root then bad:=true;
- q_link(1):=0; q_outp(1):=0; qmax:=1;
- while i<pat_len do {insert rest of pattern}
- begin incr(i); q_char(1):=pat[i];
- t:=first_fit;
- trie_link(s):=t;
- s:=t+pat[i];
- incr(trie_count);
- end;
- trie_outp(s):=imax;
- if not lower then trie_link(s):=trie_root;
- end
-
- @ The |get_letter| \.{WEB} macro defined here will be used in
- |read_word| and |read_patterns| to obtain the |internal_code|
- corresponding to a letter externally represented by a multi-character
- sequence (starting with an |escape_class| character).
-
- @d get_letter(#)==
- begin t:=trie_root;
- loop begin t:=trie_link(t)+xord[c];
- if so(trie_char(t))<>xord[c] then bad_input('Bad representation');
- @.Bad representation@>
- if trie_outp(t)<>0 then
- begin #:=trie_outp(t); goto done;
- end;
- if buf_ptr=max_buf_len then c:=' '
- else begin incr(buf_ptr); c:=buf[buf_ptr];
- end;
- end;
- done: end
-
- @ In order to prepare for the output phase we store all but the last of
- the \\{ASCII\_codes} of the external representation of each `lower case
- letter' in the pattern count trie which is no longer used at that time.
- The recursive |find_letters| procedure traverses the `external subtrie'.
-
- @p procedure find_letters(@!b: trie_pointer; @!i: dot_type);@/
- {traverse subtries of family |b|; |i| is current depth in trie}
- var c: ASCII_code; {a local variable that must be saved on recursive calls}
- @!a: trie_pointer; {does not need to be saved}
- @!j: dot_type; {loop index}
- @!l: triec_pointer;
- begin if i=1 then init_count_trie;
- for c:=cmin to last_ASCII_code do {find transitions belonging to this
- family}
- begin a:=b+c;
- if so(trie_char(a))=c then {found one}
- begin pat[i]:=c;
- if trie_outp(a)=0 then find_letters(trie_link(a),i+1)
- else if trie_link(a)=0 then {this is a lower case letter}
- @<Insert external representation for a letter into count trie@>;
- end;
- end;
- end;
-
- @ Starting from |triec_root+trie_outp(a)| we proceed through link fields
- and store all \\{ASCII\_codes} except the last one in the count trie;
- the last character has already been stored in the |xext| array.
-
- @<Insert external...@>=
- begin l:=triec_root+trie_outp(a);
- for j:=1 to i-1 do
- begin if triec_max=triec_size then
- overflow(triec_size:1,' count trie nodes');
- incr(triec_max); triec_link(l):=triec_max; l:=triec_max;
- triec_char(l):=si(pat[j]);
- end;
- triec_link(l):=0;
- end
-
- @ During the output phase we will say |write_letter(i)(f)| and
- |write(f,xext[i])| to write the lower case external representation of
- the letter with internal code |i| to file |f|: |xext[i]| is the last
- character of the external representation whereas the \.{WEB} macro
- |write_letter| defined here writes all preceding characters (if any).
-
- @d write_letter_end(#)==while l>0 do
- begin write(#,xchr[so(triec_char(l))]); l:=triec_link(l);
- end
- @d write_letter(#)==l:=triec_link(triec_root+#); write_letter_end
-
- @* Routines for traversing pattern tries.
- At the end of a pass, we traverse the count trie using the following
- recursive procedure, selecting good and bad patterns and inserting them
- into the pattern trie.
-
- @p procedure traverse_count_trie(@!b: triec_pointer; @!i: dot_type);@/
- {traverse subtries of family |b|; |i| is current depth in trie}
- var c: internal_code; {a local variable that must be saved on recursive
- calls}
- @!a: triec_pointer; {does not need to be saved}
- begin
- for c:=cmin to cmax do {find transitions belonging to this family}
- begin a:=b+c;
- if so(triec_char(a))=c then {found one}
- begin pat[i]:=c;
- if i<pat_len then traverse_count_trie(triec_link(a),i+1)
- else @<Decide what to do with this pattern@>;
- end;
- end;
- end;
-
- @ When we have come to the end of a pattern, |triec_good(a)| and
- |triec_bad(a)| contain the number of times this pattern helps or hinders
- the cause. We use the counts to determine if this pattern should be
- selected, or if it is hopeless, or if we can't decide yet. In the latter
- case, we set |more_to_come| true to indicate that there might still be
- good patterns extending the current type of patterns.
-
- @<Decide what to do...@>=
- if good_wt*triec_good(a)<thresh then {hopeless pattern}
- begin insert_pattern(max_val,pat_dot);
- incr(bad_pat_count)
- end else
- if good_wt*triec_good(a)-bad_wt*triec_bad(a)>=thresh then {good pattern}
- begin insert_pattern(hyph_level,pat_dot);
- incr(good_pat_count);
- Incr(good_count)(triec_good(a));
- Incr(bad_count)(triec_bad(a));
- end else
- more_to_come:=true
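
The three-way decision above can be written as a small pure function. This Java sketch only classifies a pattern; the insertion into the pattern trie and the counter updates are left out, and the names `PatternDecisionSketch`, `Verdict`, and `decide` are hypothetical.

```java
/** Classifies a candidate pattern from its good and bad counts. */
final class PatternDecisionSketch {
    enum Verdict { HOPELESS, GOOD, UNDECIDED }

    static Verdict decide(int good, int bad, int goodWt, int badWt, int thresh) {
        if (goodWt * good < thresh) {
            return Verdict.HOPELESS; // can never reach the threshold
        }
        if (goodWt * good - badWt * bad >= thresh) {
            return Verdict.GOOD;     // helps more than it hurts
        }
        return Verdict.UNDECIDED;    // a longer pattern might still be good
    }
}
```

For example, with `goodWt = 1`, `badWt = 2`, and `thresh = 3`, a pattern with 5 good and 1 bad instances is selected, while one with 5 good and 2 bad instances is left undecided.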
-
- @ Some global variables are used to accumulate statistics about the
- performance of a pass.
-
- @<Globals...@>=
- @!good_pat_count, @!bad_pat_count: integer; {number of patterns added at end
- of pass}
- @!good_count, @!bad_count, @!miss_count: integer; {hyphen counts}
- @!level_pattern_count: integer; {number of good patterns at level}
- @!more_to_come: boolean;
-
- @ The recursion in |traverse_count_trie| is initiated by the following
- procedure, which also prints some statistics about the patterns chosen.
- The ``efficiency'' is an estimate of pattern effectiveness.
-
- @d bad_eff==(thresh/good_wt)
-
- @p procedure collect_count_trie;
- begin good_pat_count:=0; bad_pat_count:=0;
- good_count:=0; bad_count:=0;
- more_to_come:=false;
- traverse_count_trie(triec_root,1); @/
- print(good_pat_count:1,' good and ',
- bad_pat_count:1,' bad patterns added');
- Incr(level_pattern_count)(good_pat_count);
- if more_to_come then print_ln(' (more to come)') @+else print_ln(' ');
- print('finding ',good_count:1,' good and ',bad_count:1,' bad hyphens');
- if good_pat_count>0 then
- print_ln(', efficiency = ',
- good_count/(good_pat_count+bad_count/bad_eff):1:2)
- else print_ln(' ');
- print_ln('pattern trie has ',trie_count:1,' nodes, ',@|
- 'trie_max = ',trie_max:1,', ',op_count:1,' outputs');
- end;
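
The efficiency figure printed by |collect_count_trie| uses real division in Pascal. A Java version would need a cast, since `thresh / goodWt` on two `int` values would truncate. The sketch below is an illustration under that assumption; `EfficiencySketch` and `efficiency` are hypothetical names, and the caller is expected to check that `goodPatCount` is positive first, as the original does.

```java
/** Computes the pass "efficiency" estimate using floating-point division. */
final class EfficiencySketch {
    static double efficiency(int goodCount, int badCount, int goodPatCount,
            int thresh, int goodWt) {
        // bad_eff == thresh / good_wt, evaluated as a real number
        double badEff = (double) thresh / goodWt;
        return goodCount / (goodPatCount + badCount / badEff);
    }
}
```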
-
- @ At the end of a level, we traverse the pattern trie and delete bad
- patterns by removing their outputs. If no output remains, the node is
- also deleted.
-
- @p function delete_patterns(@!s: trie_pointer): trie_pointer;@/
- {delete bad patterns in subtrie |s|, return 0 if entire subtrie freed,
- otherwise |s|}
- var c: internal_code; @!t: trie_pointer; @!all_freed: boolean;
- {must be saved on recursive calls}
- @!h, @!n: op_type; {do not need to be saved}
- begin all_freed:=true;
- for c:=cmin to cmax do {find transitions belonging to this family}
- begin t:=s+c;
- if so(trie_char(t))=c then
- begin @<Link around bad outputs@>;
- if trie_link(t)>0 then
- trie_link(t):=delete_patterns(trie_link(t));
- if (trie_link(t)>0) or (trie_outp(t)>0) or (s=trie_root) then
- all_freed:=false
- else
- @<Deallocate this node@>;
- end;
- end;
- if all_freed then {entire state is freed}
- begin trie_base_used(s):=false;
- s:=0;
- end;
- delete_patterns:=s;
- end;
-
- @ @<Link around bad outputs@>=
- begin h:=0;
- hyf_nxt(0):=trie_outp(t);
- n:=hyf_nxt(0);
- while n>0 do
- begin if hyf_val(n)=max_val then hyf_nxt(h):=hyf_nxt(n)
- else h:=n;
- n:=hyf_nxt(h);
- end;
- trie_outp(t):=hyf_nxt(0);
- end
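
The unlinking loop above uses slot 0 of the output table as a temporary list head, so that removing the first entry needs no special case. A Java sketch of the same loop follows; the class and method names are hypothetical, and the arrays stand in for the |hyf_val| and |hyf_nxt| fields.

```java
/** Removes "bad" entries from a chain stored in parallel arrays. */
final class OutputChainSketch {
    /**
     * Unlinks every entry whose val equals badVal from the chain starting at
     * first, and returns the new first entry (0 if the chain is now empty).
     * next[0] is overwritten and used as a dummy head.
     */
    static int unlinkBad(int[] val, int[] next, int first, int badVal) {
        int h = 0;       // last entry known to be kept (or the dummy head)
        next[0] = first;
        int n = next[0];
        while (n > 0) {
            if (val[n] == badVal) {
                next[h] = next[n]; // link around the bad output
            } else {
                h = n;
            }
            n = next[h];
        }
        return next[0];
    }
}
```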
-
- @ Cells freed by |delete_patterns| are put at the end of the free list.
-
- @<Deallocate this node@>=
- begin trie_link(trie_back(trie_max+1)):=t;
- trie_back(t):=trie_back(trie_max+1);
- trie_link(t):=trie_max+1;
- trie_back(trie_max+1):=t;
- trie_char(t):=min_packed;@/
- decr(trie_count);
- end
-
- @ The recursion in |delete_patterns| is initiated by the following
- procedure, which also prints statistics about the number of nodes deleted,
- and zeros bad outputs in the hash table. Note that the hash table may
- become somewhat disorganized when more levels are added, but this defect
- isn't serious.
-
- @p procedure delete_bad_patterns;
- v...
[truncated message content]

From: <vic...@us...> - 2007-09-10 18:01:29
Revision: 10202
http://foray.svn.sourceforge.net/foray/?rev=10202&view=rev
Author: victormote
Date: 2007-09-10 11:01:28 -0700 (Mon, 10 Sep 2007)
Log Message:
-----------
1. Add patgen code converted from Pascal to C by Amadeus Foudray to the repository for short-term reference.
2. Add constants and variables to the java code for the PatternGenerator.
3. Suppress checkstyle warnings in PatternGenerator for now.
Modified Paths:
--------------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
trunk/foray/scripts/checkstyle-suppressions.xml
Added Paths:
-----------
trunk/foray/foray-hyphen/src/java/org/foray/hyphen/patgen.c
Modified: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-06 22:56:55 UTC (rev 10201)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/PatternGenerator.java 2007-09-10 18:01:28 UTC (rev 10202)
@@ -53,6 +53,9 @@
package org.foray.hyphen;
+import org.foray.common.WKConstants;
+
+import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
@@ -96,7 +99,149 @@
* state machine. For further details, see the \TeX 82 source.</p>
*/
public class PatternGenerator {
+ /* TODO: Turn off checkstyle suppression for this class. */
+ /* Start of Constants from TeX code. */
+ /* TODO: After conversion is complete, make all of these constants and
+ * variables private. */
+
+ /** Maximum number of levels + 1. Also used to denote bad patterns. */
+ static final byte MAX_VAL = 10;
+
+ /** A magic number for the size of the array holding the filename. */
+ private static final int FILENAME_SIZE = 9;
+
+ /** A magic number of unknown significance. */
+ private static final int HYPH_SIZE = 4;
+
+ /** A magic number of unknown significance. */
+ private static final int DIGITS_SIZE = 10;
+
+ /** Space for pattern trie. */
+ private static final int TRIE_SIZE = 550000;
+
+ /** Space for pattern count trie, must be less than {@link #TRIE_SIZE} and
+ * greater than the number of occurrences of any pattern in the
+ * dictionary. */
+ private static final int TRIEC_SIZE = 260000;
+
+ /** Size of output hash table, should be a multiple of 510. */
+ private static final short MAX_OPS = 4080;
+
+ /** Maximum pattern length. Also maximum length of external representation
+ * of a "letter". */
+ private static final byte MAX_DOT = 15;
+
+ /** Maximum word length. */
+ private static final byte MAX_LEN = 50;
+
+ /** Maximum length of input lines, must be at least {@link #MAX_LEN}. */
+ private static final short MAX_BUF_LEN = 3000;
+
+ /* End of Constants from TeX code. */
+
+ /* Start of struct definitions from TeX code. */
+ private class Opword {
+ int dot;
+ int val;
+ int op;
+ }
+ /* End of struct definitions from TeX code. */
+
+ /* Start of Variables from TeX code. */
+ int patstart;
+ int patfinish;
+ int hyphstart;
+ int hyphfinish;
+ int goodwt;
+ int badwt;
+ int thresh;
+ char[] xord = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ char[] xchr = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ char[] xclass = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ char[] xint = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ char[] xdig = new char[DIGITS_SIZE];
+ char[] xext = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ char[] xhyf = new char[HYPH_SIZE];
+ char cmax;
+ char[] triec = new char[TRIE_SIZE + 1];
+ int[] triel = new int[TRIE_SIZE + 1];
+ int[] trier = new int[TRIE_SIZE + 1];
+ boolean[] trietaken = new boolean[TRIE_SIZE + 1];
+ char[] triecc = new char[TRIEC_SIZE + 1];
+ int[] triecl = new int[TRIEC_SIZE + 1];
+ int[] triecr = new int[TRIEC_SIZE + 1];
+ boolean[] triectaken = new boolean[TRIEC_SIZE + 1];
+ Opword[] ops = new Opword[MAX_OPS + 1];
+ char[] trieqc = new char[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ int[] trieql = new int[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ int[] trieqr = new int[WKConstants.MAX_8_BIT_UNSIGNED_VALUES];
+ char qmax;
+ char qmaxthresh;
+ int triemax;
+ int triebmax;
+ int triecount;
+ int opcount;
+ char[] pat = new char[MAX_DOT + 1];
+ int patlen;
+ int triecmax;
+ int triecbmax;
+ int trieccount;
+ int trieckmax;
+ int patcount;
+ File dictionary;
+ File patterns;
+ File translate;
+ File patout;
+ File pattmp;
+// char * fname;
+ float badfrac;
+ float denom;
+ float eff;
+ char[] buf = new char[MAX_BUF_LEN + 1];
+ int bufptr;
+ char imax;
+ int lefthyphenmin;
+ int righthyphenmin;
+ int goodpatcount;
+ int badpatcount;
+ int goodcount;
+ int badcount;
+ int misscount;
+ int levelpatterncount;
+ boolean moretocome;
+ char[] word = new char[MAX_LEN + 1];
+ char[] dots = new char[MAX_LEN + 1];
+ char[] dotw = new char[MAX_LEN + 1];
+ int[] hval = new int[MAX_LEN + 1];
+ boolean[] nomore = new boolean[MAX_LEN + 1];
+ int wlen;
+ char wordwt;
+ boolean wtchg;
+ int hyfmin;
+ int hyfmax;
+ int hyflen;
+ char gooddot;
+ char baddot;
+ int dotmin;
+ int dotmax;
+ int dotlen;
+ boolean procesp;
+ boolean hyphp;
+ int patdot;
+ int hyphlevel;
+ char[] filnam = new char[FILENAME_SIZE];
+ int maxpat;
+ int n1;
+ int n2;
+ int n3;
+ int i;
+ int j;
+ int k;
+ int dot1;
+ boolean[] morethislevel = new boolean[MAX_DOT + 1];
+ /* End of Variables from TeX code. */
+
/** The banner printed when the program starts. */
private String banner = "This is the FOray Hyphenation Pattern Generator.";
@@ -128,9 +273,19 @@
if (this.output == null) {
System.out.println("Error: Output cannot be null.");
}
+ patgen();
}
/**
+ * This is where PATGEN actually starts. We initialize the pattern trie, get
+ * the {@code hyph_level} and {@code pat_len} limits from the terminal, and
+ * generate patterns.
+ */
+ private void patgen() {
+
+ }
+
+ /**
* Command-line interface for the {@link PatternGenerator} class.
* @param args The command-line arguments.
* The arguments are:
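
Note on the port: the variables above declare `Opword[] ops = new Opword[MAX_OPS + 1]`. In C, `opword ops[maxops + 1]` is an array of zero-initialized structs. In Java the same declaration yields an array of null references, so each element has to be allocated before `ops[h].val` can be read. The following is a minimal sketch of how `initpatterntrie` and `newtrieop` from patgen.c might be carried over to Java with that in mind. The class name `PatternTrieSketch`, the method names, the reduced `TRIE_SIZE`, and the use of an exception in place of `uexit(1)` are assumptions made for illustration. They are not part of this commit.

```java
/**
 * Hypothetical sketch only: a Java rendering of initpatterntrie() and
 * newtrieop() from patgen.c. Names and the reduced TRIE_SIZE are assumptions.
 */
class PatternTrieSketch {

    static final int TRIE_SIZE = 4096; // patgen.c uses 550000; reduced for the sketch
    static final int MAX_OPS = 4080;   // same value as maxops in patgen.c

    /** Mirrors the C struct opword. */
    static final class Opword {
        int dot;
        int val;
        int op;
    }

    final char[] triec = new char[TRIE_SIZE + 1];
    final int[] triel = new int[TRIE_SIZE + 1];
    final int[] trier = new int[TRIE_SIZE + 1];
    final boolean[] trietaken = new boolean[TRIE_SIZE + 1];
    final Opword[] ops = new Opword[MAX_OPS + 1];
    int triemax;
    int triebmax;
    int triecount;
    int qmaxthresh;
    int opcount;

    /** Sets up the 256 root nodes and an empty outputs table. */
    void initPatternTrie() {
        for (int c = 0; c <= 255; c++) {
            triec[1 + c] = (char) c;
            triel[1 + c] = 0;
            trier[1 + c] = 0;
            trietaken[1 + c] = false;
        }
        trietaken[1] = true;
        triebmax = 1;
        triemax = 256;
        triecount = 256;
        qmaxthresh = 5;
        triel[0] = triemax + 1;
        trier[triemax + 1] = 0;
        // A Java object array starts out as all nulls, so allocate every
        // slot, including index 0, which deletepatterns() uses as scratch.
        for (int h = 0; h <= MAX_OPS; h++) {
            ops[h] = new Opword(); // fields default to 0, i.e. "empty"
        }
        opcount = 0;
    }

    /** Returns the hash-table slot holding (v, d, n), inserting it if absent. */
    int newTrieOp(int v, int d, int n) {
        int h = ((n + 313 * d + 361 * v) % MAX_OPS) + 1;
        while (true) {
            if (ops[h].val == 0) { // empty slot: claim it
                opcount++;
                if (opcount == MAX_OPS) {
                    throw new IllegalStateException(
                            "PATGEN capacity exceeded, sorry [" + MAX_OPS + " outputs].");
                }
                ops[h].val = v;
                ops[h].dot = d;
                ops[h].op = n;
                return h;
            }
            if (ops[h].val == v && ops[h].dot == d && ops[h].op == n) {
                return h; // already present
            }
            h = (h > 1) ? h - 1 : MAX_OPS; // linear probe downward, wrapping
        }
    }

    public static void main(String[] args) {
        PatternTrieSketch t = new PatternTrieSketch();
        t.initPatternTrie();
        System.out.println(t.newTrieOp(1, 2, 0));   // slot for a new triple
        System.out.println(t.newTrieOp(1, 2, 0));   // same triple, same slot
        System.out.println(t.newTrieOp(1, 0, 626)); // same hash, probes to next slot
    }
}
```

As in the C code, a `val` of 0 marks an empty slot, so this sketch assumes callers only pass `v >= 1`.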
Added: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/patgen.c
===================================================================
--- trunk/foray/foray-hyphen/src/java/org/foray/hyphen/patgen.c (rev 0)
+++ trunk/foray/foray-hyphen/src/java/org/foray/hyphen/patgen.c 2007-09-10 18:01:28 UTC (rev 10202)
@@ -0,0 +1,1932 @@
+#define PATGEN
+#include "cpascal.h"
+/* 9999 */
+#define triesize ( 550000L )
+#define triecsize ( 260000L )
+#define maxops ( 4080 )
+#define maxval ( 10 )
+#define maxdot ( 15 )
+#define maxlen ( 50 )
+#define maxbuflen ( 3000 )
+typedef unsigned char ASCIIcode ;
+typedef ASCIIcode textchar ;
+typedef text textfile ;
+typedef unsigned char packedASCIIcode ;
+typedef ASCIIcode internalcode ;
+typedef packedASCIIcode packedinternalcode ;
+typedef char classtype ;
+typedef char digit ;
+typedef char hyftype ;
+typedef unsigned char qindex ;
+typedef integer valtype ;
+typedef integer dottype ;
+typedef integer optype ;
+typedef integer wordindex ;
+typedef integer triepointer ;
+typedef integer triecpointer ;
+typedef struct {
+ dottype dot ;
+ valtype val ;
+ optype op ;
+} opword ;
+dottype patstart, patfinish ;
+valtype hyphstart, hyphfinish ;
+integer goodwt, badwt, thresh ;
+ASCIIcode xord[256] ;
+textchar xchr[256] ;
+classtype xclass[256] ;
+internalcode xint[256] ;
+textchar xdig[10] ;
+textchar xext[256] ;
+textchar xhyf[4] ;
+internalcode cmax ;
+packedinternalcode triec[triesize + 1] ;
+triepointer triel[triesize + 1], trier[triesize + 1] ;
+boolean trietaken[triesize + 1] ;
+packedinternalcode triecc[triecsize + 1] ;
+triecpointer triecl[triecsize + 1], triecr[triecsize + 1] ;
+boolean triectaken[triecsize + 1] ;
+opword ops[maxops + 1] ;
+internalcode trieqc[256] ;
+triepointer trieql[256], trieqr[256] ;
+qindex qmax ;
+qindex qmaxthresh ;
+triepointer triemax ;
+triepointer triebmax ;
+triepointer triecount ;
+optype opcount ;
+internalcode pat[maxdot + 1] ;
+dottype patlen ;
+triecpointer triecmax, triecbmax, trieccount ;
+triecpointer trieckmax ;
+integer patcount ;
+textfile dictionary, patterns, translate, patout, pattmp ;
+char * fname ;
+real badfrac, denom, eff ;
+textchar buf[maxbuflen + 1] ;
+integer bufptr ;
+internalcode imax ;
+dottype lefthyphenmin, righthyphenmin ;
+integer goodpatcount, badpatcount ;
+integer goodcount, badcount, misscount ;
+integer levelpatterncount ;
+boolean moretocome ;
+internalcode word[maxlen + 1] ;
+hyftype dots[maxlen + 1] ;
+digit dotw[maxlen + 1] ;
+valtype hval[maxlen + 1] ;
+boolean nomore[maxlen + 1] ;
+wordindex wlen ;
+digit wordwt ;
+boolean wtchg ;
+wordindex hyfmin, hyfmax, hyflen ;
+hyftype gooddot, baddot ;
+wordindex dotmin, dotmax, dotlen ;
+boolean procesp, hyphp ;
+dottype patdot ;
+valtype hyphlevel ;
+char filnam[9] ;
+valtype maxpat ;
+integer n1, n2, n3 ;
+valtype i ;
+dottype j ;
+dottype k ;
+dottype dot1 ;
+boolean morethislevel[maxdot + 1] ;
+
+#include "patgen.h"
+void
+#ifdef HAVE_PROTOTYPES
+parsearguments ( void )
+#else
+parsearguments ( )
+#endif
+{
+
+#define noptions ( 2 )
+ getoptstruct longoptions[noptions + 1] ;
+ integer getoptreturnval ;
+ cinttype optionindex ;
+ integer currentoption ;
+ currentoption = 0 ;
+ longoptions [currentoption ].name = "help" ;
+ longoptions [currentoption ].hasarg = 0 ;
+ longoptions [currentoption ].flag = 0 ;
+ longoptions [currentoption ].val = 0 ;
+ currentoption = currentoption + 1 ;
+ longoptions [currentoption ].name = "version" ;
+ longoptions [currentoption ].hasarg = 0 ;
+ longoptions [currentoption ].flag = 0 ;
+ longoptions [currentoption ].val = 0 ;
+ currentoption = currentoption + 1 ;
+ longoptions [currentoption ].name = 0 ;
+ longoptions [currentoption ].hasarg = 0 ;
+ longoptions [currentoption ].flag = 0 ;
+ longoptions [currentoption ].val = 0 ;
+ do {
+ getoptreturnval = getoptlongonly ( argc , argv , "" , longoptions ,
+ addressof ( optionindex ) ) ;
+ if ( getoptreturnval == -1 )
+ {
+ ;
+ }
+ else if ( getoptreturnval == '?' )
+ {
+ usage ( "patgen" ) ;
+ }
+ else if ( ( strcmp ( longoptions [optionindex ].name , "help" ) == 0 ) )
+ {
+ usagehelp ( PATGENHELP , nil ) ;
+ }
+ else if ( ( strcmp ( longoptions [optionindex ].name , "version" ) == 0
+ ) )
+ {
+ printversionandexit ( "This is PATGEN, Version 2.3" , nil ,
+ "Frank M. Liang and Peter Breitenlohner" ) ;
+ }
+ } while ( ! ( getoptreturnval == -1 ) ) ;
+ if ( ( optind + 4 != argc ) )
+ {
+ fprintf ( stderr , "%s\n", "patgen: Need exactly four arguments." ) ;
+ usage ( "patgen" ) ;
+ }
+}
+void
+#ifdef HAVE_PROTOTYPES
+initialize ( void )
+#else
+initialize ( )
+#endif
+{
+ integer bad ;
+ textchar i ;
+ ASCIIcode j ;
+ kpsesetprogname ( argv [0 ]) ;
+ parsearguments () ;
+ Fputs ( output , "This is PATGEN, Version 2.3" ) ;
+ fprintf ( output , "%s\n", versionstring ) ;
+ bad = 0 ;
+ if ( 255 < 127 )
+ bad = 1 ;
+ if ( ( 0 != 0 ) || ( 0 != 0 ) )
+ bad = 2 ;
+ if ( ( triecsize < 4096 ) || ( triesize < triecsize ) )
+ bad = 3 ;
+ if ( maxops > triesize )
+ bad = 4 ;
+ if ( maxval > 10 )
+ bad = 5 ;
+ if ( maxbuflen < maxlen )
+ bad = 6 ;
+ if ( bad > 0 )
+ {
+ fprintf ( stderr , "%s%ld\n", "Bad constants---case " , (long)bad ) ;
+ uexit ( 1 ) ;
+ }
+ {register integer for_end; j = 0 ;for_end = 255 ; if ( j <= for_end) do
+ xchr [j ]= ' ' ;
+ while ( j++ < for_end ) ;}
+ xchr [46 ]= '.' ;
+ xchr [48 ]= '0' ;
+ xchr [49 ]= '1' ;
+ xchr [50 ]= '2' ;
+ xchr [51 ]= '3' ;
+ xchr [52 ]= '4' ;
+ xchr [53 ]= '5' ;
+ xchr [54 ]= '6' ;
+ xchr [55 ]= '7' ;
+ xchr [56 ]= '8' ;
+ xchr [57 ]= '9' ;
+ xchr [65 ]= 'A' ;
+ xchr [66 ]= 'B' ;
+ xchr [67 ]= 'C' ;
+ xchr [68 ]= 'D' ;
+ xchr [69 ]= 'E' ;
+ xchr [70 ]= 'F' ;
+ xchr [71 ]= 'G' ;
+ xchr [72 ]= 'H' ;
+ xchr [73 ]= 'I' ;
+ xchr [74 ]= 'J' ;
+ xchr [75 ]= 'K' ;
+ xchr [76 ]= 'L' ;
+ xchr [77 ]= 'M' ;
+ xchr [78 ]= 'N' ;
+ xchr [79 ]= 'O' ;
+ xchr [80 ]= 'P' ;
+ xchr [81 ]= 'Q' ;
+ xchr [82 ]= 'R' ;
+ xchr [83 ]= 'S' ;
+ xchr [84 ]= 'T' ;
+ xchr [85 ]= 'U' ;
+ xchr [86 ]= 'V' ;
+ xchr [87 ]= 'W' ;
+ xchr [88 ]= 'X' ;
+ xchr [89 ]= 'Y' ;
+ xchr [90 ]= 'Z' ;
+ xchr [97 ]= 'a' ;
+ xchr [98 ]= 'b' ;
+ xchr [99 ]= 'c' ;
+ xchr [100 ]= 'd' ;
+ xchr [101 ]= 'e' ;
+ xchr [102 ]= 'f' ;
+ xchr [103 ]= 'g' ;
+ xchr [104 ]= 'h' ;
+ xchr [105 ]= 'i' ;
+ xchr [106 ]= 'j' ;
+ xchr [107 ]= 'k' ;
+ xchr [108 ]= 'l' ;
+ xchr [109 ]= 'm' ;
+ xchr [110 ]= 'n' ;
+ xchr [111 ]= 'o' ;
+ xchr [112 ]= 'p' ;
+ xchr [113 ]= 'q' ;
+ xchr [114 ]= 'r' ;
+ xchr [115 ]= 's' ;
+ xchr [116 ]= 't' ;
+ xchr [117 ]= 'u' ;
+ xchr [118 ]= 'v' ;
+ xchr [119 ]= 'w' ;
+ xchr [120 ]= 'x' ;
+ xchr [121 ]= 'y' ;
+ xchr [122 ]= 'z' ;
+ {register integer for_end; i = chr ( 0 ) ;for_end = chr ( 255 ) ; if ( i
+ <= for_end) do
+ xord [i ]= 0 ;
+ while ( i++ < for_end ) ;}
+ {register integer for_end; j = 0 ;for_end = 255 ; if ( j <= for_end) do
+ xord [xchr [j ]]= j ;
+ while ( j++ < for_end ) ;}
+ xord [' ' ]= 32 ;
+ xord [chr ( 9 ) ]= 32 ;
+ {register integer for_end; i = chr ( 0 ) ;for_end = chr ( 255 ) ; if ( i
+ <= for_end) do
+ {
+ xclass [i ]= 5 ;
+ xint [i ]= 0 ;
+ }
+ while ( i++ < for_end ) ;}
+ xclass [' ' ]= 0 ;
+ {register integer for_end; j = 0 ;for_end = 255 ; if ( j <= for_end) do
+ xext [j ]= ' ' ;
+ while ( j++ < for_end ) ;}
+ xext [1 ]= '.' ;
+ {register integer for_end; j = 0 ;for_end = 9 ; if ( j <= for_end) do
+ {
+ xdig [j ]= xchr [j + 48 ];
+ xclass [xdig [j ]]= 1 ;
+ xint [xdig [j ]]= j ;
+ }
+ while ( j++ < for_end ) ;}
+ xhyf [1 ]= '.' ;
+ xhyf [2 ]= '-' ;
+ xhyf [3 ]= '*' ;
+}
+ASCIIcode
+#ifdef HAVE_PROTOTYPES
+zgetASCII ( textchar c )
+#else
+zgetASCII ( c )
+ textchar c ;
+#endif
+{
+ /* 40 */ register ASCIIcode Result; ASCIIcode i ;
+ i = xord [c ];
+ if ( i == 0 )
+ {
+ while ( i < 255 ) {
+
+ i = i + 1 ;
+ if ( ( xchr [i ]== ' ' ) && ( i != 32 ) )
+ goto lab40 ;
+ }
+ {
+ fprintf ( stderr , "%s%ld%s%s\n", "PATGEN capacity exceeded, sorry [" , (long)256 , " characters" , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ lab40: xord [c ]= i ;
+ xchr [i ]= c ;
+ }
+ Result = i ;
+ return Result ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+initpatterntrie ( void )
+#else
+initpatterntrie ( )
+#endif
+{
+ internalcode c ;
+ optype h ;
+ {register integer for_end; c = 0 ;for_end = 255 ; if ( c <= for_end) do
+ {
+ triec [1 + c ]= c ;
+ triel [1 + c ]= 0 ;
+ trier [1 + c ]= 0 ;
+ trietaken [1 + c ]= false ;
+ }
+ while ( c++ < for_end ) ;}
+ trietaken [1 ]= true ;
+ triebmax = 1 ;
+ triemax = 256 ;
+ triecount = 256 ;
+ qmaxthresh = 5 ;
+ triel [0 ]= triemax + 1 ;
+ trier [triemax + 1 ]= 0 ;
+ {register integer for_end; h = 1 ;for_end = maxops ; if ( h <= for_end) do
+ ops [h ].val = 0 ;
+ while ( h++ < for_end ) ;}
+ opcount = 0 ;
+}
+triepointer
+#ifdef HAVE_PROTOTYPES
+firstfit ( void )
+#else
+firstfit ( )
+#endif
+{
+ /* 40 41 */ register triepointer Result; triepointer s, t ;
+ qindex q ;
+ if ( qmax > qmaxthresh )
+ t = trier [triemax + 1 ];
+ else t = 0 ;
+ while ( true ) {
+
+ t = triel [t ];
+ s = t - trieqc [1 ];
+ if ( s > triesize - 256 )
+ {
+ fprintf ( stderr , "%s%ld%s%s\n", "PATGEN capacity exceeded, sorry [" , (long)triesize , " pattern trie nodes" , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ while ( triebmax < s ) {
+
+ triebmax = triebmax + 1 ;
+ trietaken [triebmax ]= false ;
+ triec [triebmax + 255 ]= 0 ;
+ triel [triebmax + 255 ]= triebmax + 256 ;
+ trier [triebmax + 256 ]= triebmax + 255 ;
+ }
+ if ( trietaken [s ])
+ goto lab41 ;
+ {register integer for_end; q = qmax ;for_end = 2 ; if ( q >= for_end) do
+ if ( triec [s + trieqc [q ]]!= 0 )
+ goto lab41 ;
+ while ( q-- > for_end ) ;}
+ goto lab40 ;
+ lab41: ;
+ }
+ lab40: ;
+ {register integer for_end; q = 1 ;for_end = qmax ; if ( q <= for_end) do
+ {
+ t = s + trieqc [q ];
+ triel [trier [t ]]= triel [t ];
+ trier [triel [t ]]= trier [t ];
+ triec [t ]= trieqc [q ];
+ triel [t ]= trieql [q ];
+ trier [t ]= trieqr [q ];
+ if ( t > triemax )
+ triemax = t ;
+ }
+ while ( q++ < for_end ) ;}
+ trietaken [s ]= true ;
+ Result = s ;
+ return Result ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+zunpack ( triepointer s )
+#else
+zunpack ( s )
+ triepointer s ;
+#endif
+{
+ internalcode c ;
+ triepointer t ;
+ qmax = 1 ;
+ {register integer for_end; c = 1 ;for_end = cmax ; if ( c <= for_end) do
+ {
+ t = s + c ;
+ if ( triec [t ]== c )
+ {
+ trieqc [qmax ]= c ;
+ trieql [qmax ]= triel [t ];
+ trieqr [qmax ]= trier [t ];
+ qmax = qmax + 1 ;
+ trier [triel [0 ]]= t ;
+ triel [t ]= triel [0 ];
+ triel [0 ]= t ;
+ trier [t ]= 0 ;
+ triec [t ]= 0 ;
+ }
+ }
+ while ( c++ < for_end ) ;}
+ trietaken [s ]= false ;
+}
+optype
+#ifdef HAVE_PROTOTYPES
+znewtrieop ( valtype v , dottype d , optype n )
+#else
+znewtrieop ( v , d , n )
+ valtype v ;
+ dottype d ;
+ optype n ;
+#endif
+{
+ /* 10 */ register optype Result; optype h ;
+ h = ( ( n + 313 * d + 361 * v ) % maxops ) + 1 ;
+ while ( true ) {
+
+ if ( ops [h ].val == 0 )
+ {
+ opcount = opcount + 1 ;
+ if ( opcount == maxops )
+ {
+ fprintf ( stderr , "%s%ld%s%s\n", "PATGEN capacity exceeded, sorry [" , (long)maxops , " outputs" , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ ops [h ].val = v ;
+ ops [h ].dot = d ;
+ ops [h ].op = n ;
+ Result = h ;
+ goto lab10 ;
+ }
+ if ( ( ops [h ].val == v ) && ( ops [h ].dot == d ) && ( ops [h ].op
+ == n ) )
+ {
+ Result = h ;
+ goto lab10 ;
+ }
+ if ( h > 1 )
+ h = h - 1 ;
+ else h = maxops ;
+ }
+ lab10: ;
+ return Result ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+zinsertpattern ( valtype val , dottype dot )
+#else
+zinsertpattern ( val , dot )
+ valtype val ;
+ dottype dot ;
+#endif
+{
+ dottype i ;
+ triepointer s, t ;
+ i = 1 ;
+ s = 1 + pat [i ];
+ t = triel [s ];
+ while ( ( t > 0 ) && ( i < patlen ) ) {
+
+ i = i + 1 ;
+ t = t + pat [i ];
+ if ( triec [t ]!= pat [i ])
+ {
+ if ( triec [t ]== 0 )
+ {
+ triel [trier [t ]]= triel [t ];
+ trier [triel [t ]]= trier [t ];
+ triec [t ]= pat [i ];
+ triel [t ]= 0 ;
+ trier [t ]= 0 ;
+ if ( t > triemax )
+ triemax = t ;
+ }
+ else {
+
+ unpack ( t - pat [i ]) ;
+ trieqc [qmax ]= pat [i ];
+ trieql [qmax ]= 0 ;
+ trieqr [qmax ]= 0 ;
+ t = firstfit () ;
+ triel [s ]= t ;
+ t = t + pat [i ];
+ }
+ triecount = triecount + 1 ;
+ }
+ s = t ;
+ t = triel [s ];
+ }
+ trieql [1 ]= 0 ;
+ trieqr [1 ]= 0 ;
+ qmax = 1 ;
+ while ( i < patlen ) {
+
+ i = i + 1 ;
+ trieqc [1 ]= pat [i ];
+ t = firstfit () ;
+ triel [s ]= t ;
+ s = t + pat [i ];
+ triecount = triecount + 1 ;
+ }
+ trier [s ]= newtrieop ( val , dot , trier [s ]) ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+initcounttrie ( void )
+#else
+initcounttrie ( )
+#endif
+{
+ internalcode c ;
+ {register integer for_end; c = 0 ;for_end = 255 ; if ( c <= for_end) do
+ {
+ triecc [1 + c ]= c ;
+ triecl [1 + c ]= 0 ;
+ triecr [1 + c ]= 0 ;
+ triectaken [1 + c ]= false ;
+ }
+ while ( c++ < for_end ) ;}
+ triectaken [1 ]= true ;
+ triecbmax = 1 ;
+ triecmax = 256 ;
+ trieccount = 256 ;
+ trieckmax = 4096 ;
+ triecl [0 ]= triecmax + 1 ;
+ triecr [triecmax + 1 ]= 0 ;
+ patcount = 0 ;
+}
+triecpointer
+#ifdef HAVE_PROTOTYPES
+firstcfit ( void )
+#else
+firstcfit ( )
+#endif
+{
+ /* 40 41 */ register triecpointer Result; triecpointer a, b ;
+ qindex q ;
+ if ( qmax > 3 )
+ a = triecr [triecmax + 1 ];
+ else a = 0 ;
+ while ( true ) {
+
+ a = triecl [a ];
+ b = a - trieqc [1 ];
+ if ( b > trieckmax - 256 )
+ {
+ if ( trieckmax == triecsize )
+ {
+ fprintf ( stderr , "%s%ld%s%s\n", "PATGEN capacity exceeded, sorry [" , (long)triecsize , " count trie nodes" , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ fprintf ( output , "%ld%s", (long)trieckmax / 1024 , "K " ) ;
+ if ( trieckmax > triecsize - 4096 )
+ trieckmax = triecsize ;
+ else trieckmax = trieckmax + 4096 ;
+ }
+ while ( triecbmax < b ) {
+
+ triecbmax = triecbmax + 1 ;
+ triectaken [triecbmax ]= false ;
+ triecc [triecbmax + 255 ]= 0 ;
+ triecl [triecbmax + 255 ]= triecbmax + 256 ;
+ triecr [triecbmax + 256 ]= triecbmax + 255 ;
+ }
+ if ( triectaken [b ])
+ goto lab41 ;
+ {register integer for_end; q = qmax ;for_end = 2 ; if ( q >= for_end) do
+ if ( triecc [b + trieqc [q ]]!= 0 )
+ goto lab41 ;
+ while ( q-- > for_end ) ;}
+ goto lab40 ;
+ lab41: ;
+ }
+ lab40: ;
+ {register integer for_end; q = 1 ;for_end = qmax ; if ( q <= for_end) do
+ {
+ a = b + trieqc [q ];
+ triecl [triecr [a ]]= triecl [a ];
+ triecr [triecl [a ]]= triecr [a ];
+ triecc [a ]= trieqc [q ];
+ triecl [a ]= trieql [q ];
+ triecr [a ]= trieqr [q ];
+ if ( a > triecmax )
+ triecmax = a ;
+ }
+ while ( q++ < for_end ) ;}
+ triectaken [b ]= true ;
+ Result = b ;
+ return Result ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+zunpackc ( triecpointer b )
+#else
+zunpackc ( b )
+ triecpointer b ;
+#endif
+{
+ internalcode c ;
+ triecpointer a ;
+ qmax = 1 ;
+ {register integer for_end; c = 1 ;for_end = cmax ; if ( c <= for_end) do
+ {
+ a = b + c ;
+ if ( triecc [a ]== c )
+ {
+ trieqc [qmax ]= c ;
+ trieql [qmax ]= triecl [a ];
+ trieqr [qmax ]= triecr [a ];
+ qmax = qmax + 1 ;
+ triecr [triecl [0 ]]= a ;
+ triecl [a ]= triecl [0 ];
+ triecl [0 ]= a ;
+ triecr [a ]= 0 ;
+ triecc [a ]= 0 ;
+ }
+ }
+ while ( c++ < for_end ) ;}
+ triectaken [b ]= false ;
+}
+triecpointer
+#ifdef HAVE_PROTOTYPES
+zinsertcpat ( wordindex fpos )
+#else
+zinsertcpat ( fpos )
+ wordindex fpos ;
+#endif
+{
+ register triecpointer Result; wordindex spos ;
+ triecpointer a, b ;
+ spos = fpos - patlen ;
+ spos = spos + 1 ;
+ b = 1 + word [spos ];
+ a = triecl [b ];
+ while ( ( a > 0 ) && ( spos < fpos ) ) {
+
+ spos = spos + 1 ;
+ a = a + word [spos ];
+ if ( triecc [a ]!= word [spos ])
+ {
+ if ( triecc [a ]== 0 )
+ {
+ triecl [triecr [a ]]= triecl [a ];
+ triecr [triecl [a ]]= triecr [a ];
+ triecc [a ]= word [spos ];
+ triecl [a ]= 0 ;
+ triecr [a ]= 0 ;
+ if ( a > triecmax )
+ triecmax = a ;
+ }
+ else {
+
+ unpackc ( a - word [spos ]) ;
+ trieqc [qmax ]= word [spos ];
+ trieql [qmax ]= 0 ;
+ trieqr [qmax ]= 0 ;
+ a = firstcfit () ;
+ triecl [b ]= a ;
+ a = a + word [spos ];
+ }
+ trieccount = trieccount + 1 ;
+ }
+ b = a ;
+ a = triecl [a ];
+ }
+ trieql [1 ]= 0 ;
+ trieqr [1 ]= 0 ;
+ qmax = 1 ;
+ while ( spos < fpos ) {
+
+ spos = spos + 1 ;
+ trieqc [1 ]= word [spos ];
+ a = firstcfit () ;
+ triecl [b ]= a ;
+ b = a + word [spos ];
+ trieccount = trieccount + 1 ;
+ }
+ Result = b ;
+ patcount = patcount + 1 ;
+ return Result ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+readtranslate ( void )
+#else
+readtranslate ( )
+#endif
+{
+ /* 30 */ textchar c ;
+ integer n ;
+ ASCIIcode j ;
+ boolean bad ;
+ boolean lower ;
+ dottype i ;
+ triepointer s, t ;
+ imax = 1 ;
+ fname = cmdline ( 4 ) ;
+ reset ( translate , fname ) ;
+ if ( eof ( translate ) )
+ {
+ lefthyphenmin = 2 ;
+ righthyphenmin = 3 ;
+ {register integer for_end; j = 65 ;for_end = 90 ; if ( j <= for_end) do
+ {
+ imax = imax + 1 ;
+ c = xchr [j + 32 ];
+ xclass [c ]= 3 ;
+ xint [c ]= imax ;
+ xext [imax ]= c ;
+ c = xchr [j ];
+ xclass [c ]= 3 ;
+ xint [c ]= imax ;
+ }
+ while ( j++ < for_end ) ;}
+ }
+ else {
+
+ {
+ bufptr = 0 ;
+ while ( ! eoln ( translate ) ) {
+
+ if ( ( bufptr >= maxbuflen ) )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Line too long" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ bufptr = bufptr + 1 ;
+ read ( translate , buf [bufptr ]) ;
+ }
+ readln ( translate ) ;
+ while ( bufptr < maxbuflen ) {
+
+ bufptr = bufptr + 1 ;
+ buf [bufptr ]= ' ' ;
+ }
+ }
+ bad = false ;
+ if ( buf [1 ]== ' ' )
+ n = 0 ;
+ else if ( xclass [buf [1 ]]== 1 )
+ n = xint [buf [1 ]];
+ else bad = true ;
+ if ( xclass [buf [2 ]]== 1 )
+ n = 10 * n + xint [buf [2 ]];
+ else bad = true ;
+ if ( ( n >= 1 ) && ( n < maxdot ) )
+ lefthyphenmin = n ;
+ else bad = true ;
+ if ( buf [3 ]== ' ' )
+ n = 0 ;
+ else if ( xclass [buf [3 ]]== 1 )
+ n = xint [buf [3 ]];
+ else bad = true ;
+ if ( xclass [buf [4 ]]== 1 )
+ n = 10 * n + xint [buf [4 ]];
+ else bad = true ;
+ if ( ( n >= 1 ) && ( n < maxdot ) )
+ righthyphenmin = n ;
+ else bad = true ;
+ if ( bad )
+ {
+ bad = false ;
+ do {
+ Fputs ( output , "left_hyphen_min, right_hyphen_min: " ) ;
+ input2ints ( n1 , n2 ) ;
+ if ( ( n1 >= 1 ) && ( n1 < maxdot ) && ( n2 >= 1 ) && ( n2 < maxdot )
+ )
+ {
+ lefthyphenmin = n1 ;
+ righthyphenmin = n2 ;
+ }
+ else {
+
+ n1 = 0 ;
+ fprintf ( output , "%s%ld%s\n", "Specify 1<=left_hyphen_min,right_hyphen_min<=" , (long)maxdot - 1 , " !" ) ;
+ }
+ } while ( ! ( n1 > 0 ) ) ;
+ }
+ {register integer for_end; j = 1 ;for_end = 3 ; if ( j <= for_end) do
+ {
+ if ( buf [j + 4 ]!= ' ' )
+ xhyf [j ]= buf [j + 4 ];
+ if ( xclass [xhyf [j ]]== 5 )
+ xclass [xhyf [j ]]= 2 ;
+ else bad = true ;
+ }
+ while ( j++ < for_end ) ;}
+ xclass ['.' ]= 2 ;
+ if ( bad )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad hyphenation data" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ cmax = 254 ;
+ while ( ! eof ( translate ) ) {
+
+ {
+ bufptr = 0 ;
+ while ( ! eoln ( translate ) ) {
+
+ if ( ( bufptr >= maxbuflen ) )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Line too long" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ bufptr = bufptr + 1 ;
+ read ( translate , buf [bufptr ]) ;
+ }
+ readln ( translate ) ;
+ while ( bufptr < maxbuflen ) {
+
+ bufptr = bufptr + 1 ;
+ buf [bufptr ]= ' ' ;
+ }
+ }
+ bufptr = 1 ;
+ lower = true ;
+ while ( ! bad ) {
+
+ patlen = 0 ;
+ do {
+ if ( bufptr < maxbuflen )
+ bufptr = bufptr + 1 ;
+ else bad = true ;
+ if ( buf [bufptr ]== buf [1 ])
+ if ( patlen == 0 )
+ goto lab30 ;
+ else {
+
+ if ( lower )
+ {
+ if ( imax == 255 )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s%ld%s%s\n", "PATGEN capacity exceeded, sorry [" , (long)256 , " letters" , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ imax = imax + 1 ;
+ xext [imax ]= xchr [pat [patlen ]];
+ }
+ c = xchr [pat [1 ]];
+ if ( patlen == 1 )
+ {
+ if ( xclass [c ]!= 5 )
+ bad = true ;
+ xclass [c ]= 3 ;
+ xint [c ]= imax ;
+ }
+ else {
+
+ if ( xclass [c ]== 5 )
+ xclass [c ]= 4 ;
+ if ( xclass [c ]!= 4 )
+ bad = true ;
+ i = 0 ;
+ s = 1 ;
+ t = triel [s ];
+ while ( ( t > 1 ) && ( i < patlen ) ) {
+
+ i = i + 1 ;
+ t = t + pat [i ];
+ if ( triec [t ]!= pat [i ])
+ {
+ if ( triec [t ]== 0 )
+ {
+ triel [trier [t ]]= triel [t ];
+ trier [triel [t ]]= trier [t ];
+ triec [t ]= pat [i ];
+ triel [t ]= 0 ;
+ trier [t ]= 0 ;
+ if ( t > triemax )
+ triemax = t ;
+ }
+ else {
+
+ unpack ( t - pat [i ]) ;
+ trieqc [qmax ]= pat [i ];
+ trieql [qmax ]= 0 ;
+ trieqr [qmax ]= 0 ;
+ t = firstfit () ;
+ triel [s ]= t ;
+ t = t + pat [i ];
+ }
+ triecount = triecount + 1 ;
+ }
+ else if ( trier [t ]> 0 )
+ bad = true ;
+ s = t ;
+ t = triel [s ];
+ }
+ if ( t > 1 )
+ bad = true ;
+ trieql [1 ]= 0 ;
+ trieqr [1 ]= 0 ;
+ qmax = 1 ;
+ while ( i < patlen ) {
+
+ i = i + 1 ;
+ trieqc [1 ]= pat [i ];
+ t = firstfit () ;
+ triel [s ]= t ;
+ s = t + pat [i ];
+ triecount = triecount + 1 ;
+ }
+ trier [s ]= imax ;
+ if ( ! lower )
+ triel [s ]= 1 ;
+ }
+ }
+ else if ( patlen == maxdot )
+ bad = true ;
+ else {
+
+ patlen = patlen + 1 ;
+ pat [patlen ]= getASCII ( buf [bufptr ]) ;
+ }
+ } while ( ! ( ( buf [bufptr ]== buf [1 ]) || bad ) ) ;
+ lower = false ;
+ }
+ lab30: if ( bad )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad representation" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ }
+ }
+ xfclose ( translate , "inputfile" ) ;
+ fprintf ( output , "%s%ld%s%ld%s%ld%s\n", "left_hyphen_min = " , (long)lefthyphenmin , ", right_hyphen_min = " , (long)righthyphenmin , ", " , (long)imax - 1 , " letters" ) ;
+ cmax = imax ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+zfindletters ( triepointer b , dottype i )
+#else
+zfindletters ( b , i )
+ triepointer b ;
+ dottype i ;
+#endif
+{
+ ASCIIcode c ;
+ triepointer a ;
+ dottype j ;
+ triecpointer l ;
+ if ( i == 1 )
+ initcounttrie () ;
+ {register integer for_end; c = 1 ;for_end = 255 ; if ( c <= for_end) do
+ {
+ a = b + c ;
+ if ( triec [a ]== c )
+ {
+ pat [i ]= c ;
+ if ( trier [a ]== 0 )
+ findletters ( triel [a ], i + 1 ) ;
+ else if ( triel [a ]== 0 )
+ {
+ l = 1 + trier [a ];
+ {register integer for_end; j = 1 ;for_end = i - 1 ; if ( j <=
+ for_end) do
+ {
+ if ( triecmax == triecsize )
+ {
+ fprintf ( stderr , "%s%ld%s%s\n", "PATGEN capacity exceeded, sorry [" , (long)triecsize , " count trie nodes" , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ triecmax = triecmax + 1 ;
+ triecl [l ]= triecmax ;
+ l = triecmax ;
+ triecc [l ]= pat [j ];
+ }
+ while ( j++ < for_end ) ;}
+ triecl [l ]= 0 ;
+ }
+ }
+ }
+ while ( c++ < for_end ) ;}
+}
+void
+#ifdef HAVE_PROTOTYPES
+ztraversecounttrie ( triecpointer b , dottype i )
+#else
+ztraversecounttrie ( b , i )
+ triecpointer b ;
+ dottype i ;
+#endif
+{
+ internalcode c ;
+ triecpointer a ;
+ {register integer for_end; c = 1 ;for_end = cmax ; if ( c <= for_end) do
+ {
+ a = b + c ;
+ if ( triecc [a ]== c )
+ {
+ pat [i ]= c ;
+ if ( i < patlen )
+ traversecounttrie ( triecl [a ], i + 1 ) ;
+ else if ( goodwt * triecl [a ]< thresh )
+ {
+ insertpattern ( maxval , patdot ) ;
+ badpatcount = badpatcount + 1 ;
+ }
+ else if ( goodwt * triecl [a ]- badwt * triecr [a ]>= thresh )
+ {
+ insertpattern ( hyphlevel , patdot ) ;
+ goodpatcount = goodpatcount + 1 ;
+ goodcount = goodcount + triecl [a ];
+ badcount = badcount + triecr [a ];
+ }
+ else moretocome = true ;
+ }
+ }
+ while ( c++ < for_end ) ;}
+}
+void
+#ifdef HAVE_PROTOTYPES
+collectcounttrie ( void )
+#else
+collectcounttrie ( )
+#endif
+{
+ goodpatcount = 0 ;
+ badpatcount = 0 ;
+ goodcount = 0 ;
+ badcount = 0 ;
+ moretocome = false ;
+ traversecounttrie ( 1 , 1 ) ;
+ fprintf ( output , "%ld%s%ld%s", (long)goodpatcount , " good and " , (long)badpatcount , " bad patterns added" ) ;
+ levelpatterncount = levelpatterncount + goodpatcount ;
+ if ( moretocome )
+ fprintf ( output , "%s\n", " (more to come)" ) ;
+ else
+ fprintf ( output , "%c\n", ' ' ) ;
+ fprintf ( output , "%s%ld%s%ld%s", "finding " , (long)goodcount , " good and " , (long)badcount , " bad hyphens" ) ;
+ if ( goodpatcount > 0 )
+ {
+ Fputs ( output , ", efficiency = " ) ;
+ printreal ( goodcount / ((double) ( goodpatcount + badcount / ((double) (
+ thresh / ((double) goodwt ) ) ) ) ) , 1 , 2 ) ;
+ putc ('\n', output );
+ }
+ else
+ fprintf ( output , "%c\n", ' ' ) ;
+ fprintf ( output , "%s%ld%s%s%ld%s%ld%s\n", "pattern trie has " , (long)triecount , " nodes, " , "trie_max = " , (long)triemax , ", " , (long)opcount , " outputs" ) ;
+}
+triepointer
+#ifdef HAVE_PROTOTYPES
+zdeletepatterns ( triepointer s )
+#else
+zdeletepatterns ( s )
+ triepointer s ;
+#endif
+{
+ register triepointer Result; internalcode c ;
+ triepointer t ;
+ boolean allfreed ;
+ optype h, n ;
+ allfreed = true ;
+ {register integer for_end; c = 1 ;for_end = cmax ; if ( c <= for_end) do
+ {
+ t = s + c ;
+ if ( triec [t ]== c )
+ {
+ {
+ h = 0 ;
+ ops [0 ].op = trier [t ];
+ n = ops [0 ].op ;
+ while ( n > 0 ) {
+
+ if ( ops [n ].val == maxval )
+ ops [h ].op = ops [n ].op ;
+ else h = n ;
+ n = ops [h ].op ;
+ }
+ trier [t ]= ops [0 ].op ;
+ }
+ if ( triel [t ]> 0 )
+ triel [t ]= deletepatterns ( triel [t ]) ;
+ if ( ( triel [t ]> 0 ) || ( trier [t ]> 0 ) || ( s == 1 ) )
+ allfreed = false ;
+ else {
+
+ triel [trier [triemax + 1 ]]= t ;
+ trier [t ]= trier [triemax + 1 ];
+ triel [t ]= triemax + 1 ;
+ trier [triemax + 1 ]= t ;
+ triec [t ]= 0 ;
+ triecount = triecount - 1 ;
+ }
+ }
+ }
+ while ( c++ < for_end ) ;}
+ if ( allfreed )
+ {
+ trietaken [s ]= false ;
+ s = 0 ;
+ }
+ Result = s ;
+ return Result ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+deletebadpatterns ( void )
+#else
+deletebadpatterns ( )
+#endif
+{
+ optype oldopcount ;
+ triepointer oldtriecount ;
+ triepointer t ;
+ optype h ;
+ oldopcount = opcount ;
+ oldtriecount = triecount ;
+ t = deletepatterns ( 1 ) ;
+ {register integer for_end; h = 1 ;for_end = maxops ; if ( h <= for_end) do
+ if ( ops [h ].val == maxval )
+ {
+ ops [h ].val = 0 ;
+ opcount = opcount - 1 ;
+ }
+ while ( h++ < for_end ) ;}
+ fprintf ( output , "%ld%s%ld%s\n", (long)oldtriecount - triecount , " nodes and " , (long)oldopcount - opcount , " outputs deleted" ) ;
+ qmaxthresh = 7 ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+zoutputpatterns ( triepointer s , dottype patlen )
+#else
+zoutputpatterns ( s , patlen )
+ triepointer s ;
+ dottype patlen ;
+#endif
+{
+ internalcode c ;
+ triepointer t ;
+ optype h ;
+ dottype d ;
+ triecpointer l ;
+ {register integer for_end; c = 1 ;for_end = cmax ; if ( c <= for_end) do
+ {
+ t = s + c ;
+ if ( triec [t ]== c )
+ {
+ pat [patlen ]= c ;
+ h = trier [t ];
+ if ( h > 0 )
+ {
+ {register integer for_end; d = 0 ;for_end = patlen ; if ( d <=
+ for_end) do
+ hval [d ]= 0 ;
+ while ( d++ < for_end ) ;}
+ do {
+ d = ops [h ].dot ;
+ if ( hval [d ]< ops [h ].val )
+ hval [d ]= ops [h ].val ;
+ h = ops [h ].op ;
+ } while ( ! ( h == 0 ) ) ;
+ if ( hval [0 ]> 0 )
+ putc ( xdig [hval [0 ]], patout );
+ {register integer for_end; d = 1 ;for_end = patlen ; if ( d <=
+ for_end) do
+ {
+ l = triecl [1 + pat [d ]];
+ while ( l > 0 ) {
+
+ putc ( xchr [triecc [l ]], patout );
+ l = triecl [l ];
+ }
+ putc ( xext [pat [d ]], patout );
+ if ( hval [d ]> 0 )
+ putc ( xdig [hval [d ]], patout );
+ }
+ while ( d++ < for_end ) ;}
+ putc ('\n', patout );
+ }
+ if ( triel [t ]> 0 )
+ outputpatterns ( triel [t ], patlen + 1 ) ;
+ }
+ }
+ while ( c++ < for_end ) ;}
+}
+void
+#ifdef HAVE_PROTOTYPES
+readword ( void )
+#else
+readword ( )
+#endif
+{
+ /* 30 40 */ textchar c ;
+ triepointer t ;
+ {
+ bufptr = 0 ;
+ while ( ! eoln ( dictionary ) ) {
+
+ if ( ( bufptr >= maxbuflen ) )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Line too long" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ bufptr = bufptr + 1 ;
+ read ( dictionary , buf [bufptr ]) ;
+ }
+ readln ( dictionary ) ;
+ while ( bufptr < maxbuflen ) {
+
+ bufptr = bufptr + 1 ;
+ buf [bufptr ]= ' ' ;
+ }
+ }
+ word [1 ]= 1 ;
+ wlen = 1 ;
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ c = buf [bufptr ];
+ switch ( xclass [c ])
+ {case 0 :
+ goto lab40 ;
+ break ;
+ case 1 :
+ if ( wlen == 1 )
+ {
+ if ( xint [c ]!= wordwt )
+ wtchg = true ;
+ wordwt = xint [c ];
+ }
+ else dotw [wlen ]= xint [c ];
+ break ;
+ case 2 :
+ dots [wlen ]= xint [c ];
+ break ;
+ case 3 :
+ {
+ wlen = wlen + 1 ;
+ if ( wlen == maxlen )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s%s%ld%s\n", "PATGEN capacity exceeded, sorry [" , "word length=" , (long)maxlen , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ word [wlen ]= xint [c ];
+ dots [wlen ]= 0 ;
+ dotw [wlen ]= wordwt ;
+ }
+ break ;
+ case 4 :
+ {
+ wlen = wlen + 1 ;
+ if ( wlen == maxlen )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s%s%ld%s\n", "PATGEN capacity exceeded, sorry [" , "word length=" , (long)maxlen , "]." ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ {
+ t = 1 ;
+ while ( true ) {
+
+ t = triel [t ]+ xord [c ];
+ if ( triec [t ]!= xord [c ])
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad representation" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ if ( trier [t ]!= 0 )
+ {
+ word [wlen ]= trier [t ];
+ goto lab30 ;
+ }
+ if ( bufptr == maxbuflen )
+ c = ' ' ;
+ else {
+
+ bufptr = bufptr + 1 ;
+ c = buf [bufptr ];
+ }
+ }
+ lab30: ;
+ }
+ dots [wlen ]= 0 ;
+ dotw [wlen ]= wordwt ;
+ }
+ break ;
+ case 5 :
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad character" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ break ;
+ }
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ lab40: wlen = wlen + 1 ;
+ word [wlen ]= 1 ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+hyphenate ( void )
+#else
+hyphenate ( )
+#endif
+{
+ /* 30 */ wordindex spos, dpos, fpos ;
+ triepointer t ;
+ optype h ;
+ valtype v ;
+ {register integer for_end; spos = wlen - hyfmax ;for_end = 0 ; if ( spos
+ >= for_end) do
+ {
+ nomore [spos ]= false ;
+ hval [spos ]= 0 ;
+ fpos = spos + 1 ;
+ t = 1 + word [fpos ];
+ do {
+ h = trier [t ];
+ while ( h > 0 ) {
+
+ dpos = spos + ops [h ].dot ;
+ v = ops [h ].val ;
+ if ( ( v < maxval ) && ( hval [dpos ]< v ) )
+ hval [dpos ]= v ;
+ if ( ( v >= hyphlevel ) )
+ if ( ( ( fpos - patlen ) <= ( dpos - patdot ) ) && ( ( dpos - patdot
+ ) <= spos ) )
+ nomore [dpos ]= true ;
+ h = ops [h ].op ;
+ }
+ t = triel [t ];
+ if ( t == 0 )
+ goto lab30 ;
+ fpos = fpos + 1 ;
+ t = t + word [fpos ];
+ } while ( ! ( triec [t ]!= word [fpos ]) ) ;
+ lab30: ;
+ }
+ while ( spos-- > for_end ) ;}
+}
+void
+#ifdef HAVE_PROTOTYPES
+changedots ( void )
+#else
+changedots ( )
+#endif
+{
+ wordindex dpos ;
+ {register integer for_end; dpos = wlen - hyfmax ;for_end = hyfmin ; if (
+ dpos >= for_end) do
+ {
+ if ( odd ( hval [dpos ]) )
+ dots [dpos ]= dots [dpos ]+ 1 ;
+ if ( dots [dpos ]== 3 )
+ goodcount = goodcount + dotw [dpos ];
+ else if ( dots [dpos ]== 1 )
+ badcount = badcount + dotw [dpos ];
+ else if ( dots [dpos ]== 2 )
+ misscount = misscount + dotw [dpos ];
+ }
+ while ( dpos-- > for_end ) ;}
+}
+void
+#ifdef HAVE_PROTOTYPES
+outputhyphenatedword ( void )
+#else
+outputhyphenatedword ( )
+#endif
+{
+ wordindex dpos ;
+ triecpointer l ;
+ if ( wtchg )
+ {
+ putc ( xdig [wordwt ], pattmp );
+ wtchg = false ;
+ }
+ {register integer for_end; dpos = 2 ;for_end = wlen - 2 ; if ( dpos <=
+ for_end) do
+ {
+ l = triecl [1 + word [dpos ]];
+ while ( l > 0 ) {
+
+ putc ( xchr [triecc [l ]], pattmp );
+ l = triecl [l ];
+ }
+ putc ( xext [word [dpos ]], pattmp );
+ if ( dots [dpos ]!= 0 )
+ putc ( xhyf [dots [dpos ]], pattmp );
+ if ( dotw [dpos ]!= wordwt )
+ putc ( xdig [dotw [dpos ]], pattmp );
+ }
+ while ( dpos++ < for_end ) ;}
+ l = triecl [1 + word [wlen - 1 ]];
+ while ( l > 0 ) {
+
+ putc ( xchr [triecc [l ]], pattmp );
+ l = triecl [l ];
+ }
+ fprintf ( pattmp , "%c\n", xext [word [wlen - 1 ]]) ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+doword ( void )
+#else
+doword ( )
+#endif
+{
+ /* 22 30 */ wordindex spos, dpos, fpos ;
+ triecpointer a ;
+ boolean goodp ;
+ {register integer for_end; dpos = wlen - dotmax ;for_end = dotmin ; if (
+ dpos >= for_end) do
+ {
+ spos = dpos - patdot ;
+ fpos = spos + patlen ;
+ if ( nomore [dpos ])
+ goto lab22 ;
+ if ( dots [dpos ]== gooddot )
+ goodp = true ;
+ else if ( dots [dpos ]== baddot )
+ goodp = false ;
+ else goto lab22 ;
+ spos = spos + 1 ;
+ a = 1 + word [spos ];
+ while ( spos < fpos ) {
+
+ spos = spos + 1 ;
+ a = triecl [a ]+ word [spos ];
+ if ( triecc [a ]!= word [spos ])
+ {
+ a = insertcpat ( fpos ) ;
+ goto lab30 ;
+ }
+ }
+ lab30: if ( goodp )
+ triecl [a ]= triecl [a ]+ dotw [dpos ];
+ else triecr [a ]= triecr [a ]+ dotw [dpos ];
+ lab22: ;
+ }
+ while ( dpos-- > for_end ) ;}
+}
+void
+#ifdef HAVE_PROTOTYPES
+dodictionary ( void )
+#else
+dodictionary ( )
+#endif
+{
+ goodcount = 0 ;
+ badcount = 0 ;
+ misscount = 0 ;
+ wordwt = 1 ;
+ wtchg = false ;
+ fname = cmdline ( 1 ) ;
+ reset ( dictionary , fname ) ;
+ xclass ['.' ]= 5 ;
+ xclass [xhyf [1 ]]= 2 ;
+ xint [xhyf [1 ]]= 0 ;
+ xclass [xhyf [2 ]]= 2 ;
+ xint [xhyf [2 ]]= 2 ;
+ xclass [xhyf [3 ]]= 2 ;
+ xint [xhyf [3 ]]= 2 ;
+ hyfmin = lefthyphenmin + 1 ;
+ hyfmax = righthyphenmin + 1 ;
+ hyflen = hyfmin + hyfmax ;
+ if ( procesp )
+ {
+ dotmin = patdot ;
+ dotmax = patlen - patdot ;
+ if ( dotmin < hyfmin )
+ dotmin = hyfmin ;
+ if ( dotmax < hyfmax )
+ dotmax = hyfmax ;
+ dotlen = dotmin + dotmax ;
+ if ( odd ( hyphlevel ) )
+ {
+ gooddot = 2 ;
+ baddot = 0 ;
+ }
+ else {
+
+ gooddot = 1 ;
+ baddot = 3 ;
+ }
+ }
+ if ( procesp )
+ {
+ initcounttrie () ;
+ fprintf ( output , "%s%ld%s%ld\n", "processing dictionary with pat_len = " , (long)patlen , ", pat_dot = " , (long)patdot ) ;
+ }
+ if ( hyphp )
+ {
+ strcpy ( filnam , "pattmp. " ) ;
+ filnam [7 ]= xdig [hyphlevel ];
+ rewrite ( pattmp , filnam ) ;
+ fprintf ( output , "%s%c\n", "writing pattmp." , xdig [hyphlevel ]) ;
+ }
+ while ( ! eof ( dictionary ) ) {
+
+ readword () ;
+ if ( wlen >= hyflen )
+ {
+ hyphenate () ;
+ changedots () ;
+ }
+ if ( hyphp )
+ if ( wlen > 2 )
+ outputhyphenatedword () ;
+ if ( procesp )
+ if ( wlen >= dotlen )
+ doword () ;
+ }
+ xfclose ( dictionary , "inputfile" ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ fprintf ( output , "%ld%s%ld%s%ld%s\n", (long)goodcount , " good, " , (long)badcount , " bad, " , (long)misscount , " missed" ) ;
+ if ( ( goodcount + misscount ) > 0 )
+ {
+ printreal ( ( 100 * goodcount / ((double) ( goodcount + misscount ) ) ) ,
+ 1 , 2 ) ;
+ Fputs ( output , " %, " ) ;
+ printreal ( ( 100 * badcount / ((double) ( goodcount + misscount ) ) ) , 1
+ , 2 ) ;
+ Fputs ( output , " %, " ) ;
+ printreal ( ( 100 * misscount / ((double) ( goodcount + misscount ) ) ) ,
+ 1 , 2 ) ;
+ fprintf ( output , "%s\n", " %" ) ;
+ }
+ if ( procesp )
+ fprintf ( output , "%ld%s%ld%s%s%ld\n", (long)patcount , " patterns, " , (long)trieccount , " nodes in count trie, " , "triec_max = " , (long)triecmax ) ;
+ if ( hyphp )
+ xfclose ( pattmp , "outputfile" ) ;
+}
+void
+#ifdef HAVE_PROTOTYPES
+readpatterns ( void )
+#else
+readpatterns ( )
+#endif
+{
+ /* 30 40 */ textchar c ;
+ digit d ;
+ dottype i ;
+ triepointer t ;
+ xclass ['.' ]= 3 ;
+ xint ['.' ]= 1 ;
+ levelpatterncount = 0 ;
+ maxpat = 0 ;
+ fname = cmdline ( 2 ) ;
+ reset ( patterns , fname ) ;
+ while ( ! eof ( patterns ) ) {
+
+ {
+ bufptr = 0 ;
+ while ( ! eoln ( patterns ) ) {
+
+ if ( ( bufptr >= maxbuflen ) )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Line too long" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ bufptr = bufptr + 1 ;
+ read ( patterns , buf [bufptr ]) ;
+ }
+ readln ( patterns ) ;
+ while ( bufptr < maxbuflen ) {
+
+ bufptr = bufptr + 1 ;
+ buf [bufptr ]= ' ' ;
+ }
+ }
+ levelpatterncount = levelpatterncount + 1 ;
+ patlen = 0 ;
+ bufptr = 0 ;
+ hval [0 ]= 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ c = buf [bufptr ];
+ switch ( xclass [c ])
+ {case 0 :
+ goto lab40 ;
+ break ;
+ case 1 :
+ {
+ d = xint [c ];
+ if ( d >= maxval )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad hyphenation value" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ if ( d > maxpat )
+ maxpat = d ;
+ hval [patlen ]= d ;
+ }
+ break ;
+ case 3 :
+ {
+ patlen = patlen + 1 ;
+ hval [patlen ]= 0 ;
+ pat [patlen ]= xint [c ];
+ }
+ break ;
+ case 4 :
+ {
+ patlen = patlen + 1 ;
+ hval [patlen ]= 0 ;
+ {
+ t = 1 ;
+ while ( true ) {
+
+ t = triel [t ]+ xord [c ];
+ if ( triec [t ]!= xord [c ])
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad representation" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ if ( trier [t ]!= 0 )
+ {
+ pat [patlen ]= trier [t ];
+ goto lab30 ;
+ }
+ if ( bufptr == maxbuflen )
+ c = ' ' ;
+ else {
+
+ bufptr = bufptr + 1 ;
+ c = buf [bufptr ];
+ }
+ }
+ lab30: ;
+ }
+ }
+ break ;
+ case 2 :
+ case 5 :
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad character" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ break ;
+ }
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ lab40: if ( patlen > 0 )
+ {register integer for_end; i = 0 ;for_end = patlen ; if ( i <= for_end)
+ do
+ {
+ if ( hval [i ]!= 0 )
+ insertpattern ( hval [i ], i ) ;
+ if ( i > 1 )
+ if ( i < patlen )
+ if ( pat [i ]== 1 )
+ {
+ {
+ bufptr = 0 ;
+ do {
+ bufptr = bufptr + 1 ;
+ putc ( buf [bufptr ], output );
+ } while ( ! ( bufptr == maxbuflen ) ) ;
+ fprintf ( output , "%c\n", ' ' ) ;
+ }
+ {
+ fprintf ( stderr , "%s\n", "Bad edge_of_word" ) ;
+ uexit ( 1 ) ;
+ }
+ }
+ }
+ while ( i++ < for_end ) ;}
+ }
+ xfclose ( patterns , "inputfile" ) ;
+ fprintf ( output , "%ld%s\n", (long)levelpatterncount , " patterns read in" ) ;
+ fprintf ( output , "%s%ld%s%s%ld%s%ld%s\n", "pattern trie has " , (long)triecount , " nodes, " , "trie_max = " , (long)triemax , ", " , (long)opcount , " outputs" ) ;
+}
+void mainbody() {
+
+ initialize () ;
+ initpatterntrie () ;
+ readtranslate () ;
+ readpatterns () ;
+ procesp = true ;
+ hyphp = false ;
+ do {
+ Fputs ( output , "hyph_start, hyph_finish: " ) ;
+ input2ints ( n1 , n2 ) ;
+ if ( ( n1 >= 1 ) && ( n1 < maxval ) && ( n2 >= 1 ) && ( n2 < maxval ) )
+ {
+ hyphstart = n1 ;
+ hyphfinish = n2 ;
+ }
+ else {
+
+ n1 = 0 ;
+ fprintf ( output , "%s%ld%s\n", "Specify 1<=hyph_start,hyph_finish<=" , (long)maxval - 1 , " !" ) ;
+ }
+ } while ( ! ( n1 > 0 ) ) ;
+ hyphlevel = maxpat ;
+ {register integer for_end; i = hyphstart ;for_end = hyphfinish ; if ( i <=
+ for_end) do
+ {
+ hyphlevel = i ;
+ levelpatterncount = 0 ;
+ if ( hyphlevel > hyphstart )
+ fprintf ( output , "%c\n", ' ' ) ;
+ else if ( hyphstart <= maxpat )
+ fprintf ( output , "%s%ld%s\n", "Largest hyphenation value " , (long)maxpat , " in patterns should be less than hyph_start" ) ;
+ do {
+ Fputs ( output , "pat_start, pat_finish: " ) ;
+ input2ints ( n1 , n2 ) ;
+ if ( ( n1 >= 1 ) && ( n1 <= n2 ) && ( n2 <= maxdot ) )
+ {
+ patstart = n1 ;
+ patfinish = n2 ;
+ }
+ else {
+
+ n1 = 0 ;
+ fprintf ( output , "%s%ld%s\n", "Specify 1<=pat_start<=pat_finish<=" , (long)maxdot , " !" ) ;
+ }
+ } while ( ! ( n1 > 0 ) ) ;
+ do {
+ Fputs ( output , "good weight, bad weight, threshold: " ) ;
+ input3ints ( n1 , n2 , n3 ) ;
+ if ( ( n1 >= 1 ) && ( n2 >= 1 ) && ( n3 >= 1 ) )
+ {
+ goodwt = n1 ;
+ badwt = n2 ;
+ thresh = n3 ;
+ }
+ else {
+
+ n1 = 0 ;
+ fprintf ( output , "%s\n", "Specify good weight, bad weight, threshold>=1 !" ) ;
+ }
+ } while ( ! ( n1 > 0 ) ) ;
+ {register integer for_end; k = 0 ;for_end = maxdot ; if ( k <=
+ for_end) do
+ morethislevel [k ]= true ;
+ while ( k++ < for_end ) ;}
+ {register integer for_end; j = patstart ;for_end = patfinish ; if ( j
+ <= for_end) do
+ {
+ patlen = j ;
+ patdot = patlen / 2 ;
+ dot1 = patdot * 2 ;
+ do {
+ patdot = dot1 - patdot ;
+ dot1 = patlen * 2 - dot1 - 1 ;
+ if ( morethislevel [patdot ])
+ {
+ dodictionary () ;
+ collectcounttrie () ;
+ morethislevel [patdot ]= moretocome ;
+ }
+ } while ( ! ( patdot == patlen ) ) ;
+ {register integer for_end; k = maxdot ;for_end = 1 ; if ( k >=
+ for_end) do
+ if ( ! morethislevel [k - 1 ])
+ morethislevel [k ]= false ;
+ while ( k-- > for_end ) ;}
+ }
+ while ( j++ < for_end ) ;}
+ deletebadpatterns () ;
+ fprintf ( output , "%s%ld%s%ld\n", "total of " , (long)levelpatterncount , " patterns at hyph_level " , (long)hyphlevel ) ;
+ }
+ while ( i++ < for_end ) ;}
+ findletters ( triel [1 ], 1 ) ;
+ fname = cmdline ( 3 ) ;
+ rewrite ( patout , fname ) ;
+ outputpatterns ( 1 , 1 ) ;
+ xfclose ( patout , "outputfile" ) ;
+ procesp = false ;
+ hyphp = true ;
+ Fputs ( output , "hyphenate word list? " ) ;
+ {
+ buf [1 ]= getc ( stdin ) ;
+ readln ( stdin ) ;
+ }
+ if ( ( buf [1 ]== 'Y' ) || ( buf [1 ]== 'y' ) )
+ dodictionary () ;
+ lab9999: ;
+}
Property changes on: trunk/foray/foray-hyphen/src/java/org/foray/hyphen/patgen.c
___________________________________________________________________
Name: svn:keywords
+ "Author Id Rev Date URL"
Name: svn:eol-style
+ native
Modified: trunk/foray/scripts/checkstyle-suppressions.xml
===================================================================
--- trunk/foray/scripts/checkstyle-suppressions.xml 2007-09-06 22:56:55 UTC (rev 10201)
+++ trunk/foray/scripts/checkstyle-suppressions.xml 2007-09-10 18:01:28 UTC (rev 10202)
@@ -13,4 +13,8 @@
<suppress checks="MagicNumber"
files="src.javatest.*"/>
+ <!-- Temporarily suppress all javadoc warnings for the Pattern Generator
+ class. -->
+ <suppress checks=".*" files="src.java.org.foray.hyphen.PatternGenerator.*"/>
+
</suppressions>