Henrik Levkowetz
cd030d3b43
Adding copyright notices to all python files
...
- Legacy-Id: 716
2007-06-27 21:16:34 +00:00
Henrik Levkowetz
de9a7ddbc4
Added the ability to give fill and pre(formatted) switches to the soup2text command
...
- Legacy-Id: 403
2007-06-15 13:28:12 +00:00
Henrik Levkowetz
754ba193ca
A small script to run a diff against the master for one single django URL specified in any of the testurl.list files. Uses environment variable DJANGO_SERVER if set, or http://merlot.tools.ietf.org:31415/ otherwise.
...
- Legacy-Id: 375
2007-06-13 17:26:04 +00:00
Henrik Levkowetz
e2db0d869d
Compact spaces after \n conversion in soup2html.
...
- Legacy-Id: 351
2007-06-12 22:46:30 +00:00
Henrik Levkowetz
aa68d30e85
Tweaking the paragraph filling code some more
...
- Legacy-Id: 346
2007-06-12 20:31:28 +00:00
Henrik Levkowetz
712cd8aa17
Tweak to again avoid space at the beginning of a paragraph.
...
- Legacy-Id: 345
2007-06-12 20:23:09 +00:00
Henrik Levkowetz
890b8a1ada
Fix potential exception in soup2html again.
...
- Legacy-Id: 341
2007-06-12 18:34:26 +00:00
Henrik Levkowetz
6b7137994a
Fix potential exception in soup2html.
...
- Legacy-Id: 340
2007-06-12 18:12:19 +00:00
Henrik Levkowetz
dd37257c0c
Only print the first 100 lines of a long diff. New soup2html code for spacing associated with certain tags.
...
- Legacy-Id: 337
2007-06-12 17:52:07 +00:00
Henrik Levkowetz
aba06af322
Another soup2html() tweak to better avoid indentation at paragraph start.
...
- Legacy-Id: 330
2007-06-12 01:32:05 +00:00
Henrik Levkowetz
541b041cdc
soup2html() tweak to better avoid indentation at paragraph start.
...
- Legacy-Id: 329
2007-06-12 00:55:41 +00:00
Henrik Levkowetz
67eb998901
soup2html() tweak to handle html comments.
...
- Legacy-Id: 328
2007-06-12 00:37:16 +00:00
Henrik Levkowetz
b15c02c830
soup2html() tweak to handle table cells.
...
- Legacy-Id: 326
2007-06-12 00:25:45 +00:00
Henrik Levkowetz
bfcb0e6c78
Two soup2text tweaks.
...
- Legacy-Id: 324
2007-06-11 23:52:51 +00:00
Henrik Levkowetz
1cafcf3e9d
Changed approach to space normalization in soup2text(). Plain whitespace stripping followed by reassembly caused too much information loss. Accompanying changes in generic diff files.
...
- Legacy-Id: 321
2007-06-11 20:28:19 +00:00
Henrik Levkowetz
8e8c3ff5e2
* ietf/tests.py: Remove filetime() again -- not using it.
* ietf/utils/soup2text.py: Do line ending normalization.
- Legacy-Id: 315
2007-06-11 17:26:59 +00:00
Henrik Levkowetz
7f512b4889
make soup2text convert numeric character codes (e.g., "&#39;") too.
...
- Legacy-Id: 306
2007-06-11 07:47:56 +00:00
Henrik Levkowetz
0452fca7d2
* ietf/tests.py, in reduce(): add ad-hoc fix for pathologic case of not
closing <li> tags. BeautifulSoup can handle it, but the recursive text
rendering code in soup2text recurses too deeply with a sufficiently long
list...
* ietf/tests.py, in setUp(): grab the right tuple element when extracting
the URLs from the url test tuples
* ietf/tests.py, in read_testurls(): close opened file
* ietf/tests.py, in doUrlsTest(): narrower try/except clause, and a new one
* soup2text.py, in para(): undo previous change
- Legacy-Id: 304
2007-06-11 06:13:29 +00:00
Henrik Levkowetz
b42e0728c8
Accept both testurl.list and testurls.list as test url list file names. Output status for good compares, too.
...
- Legacy-Id: 303
2007-06-11 04:43:22 +00:00
Henrik Levkowetz
9b78963547
Fix occasional bad sentence end merges in ietf/utils/soup2text.py.
Remove some now unneeded exceptions from ietf/testurl.list
- Legacy-Id: 302
2007-06-11 04:22:29 +00:00
Henrik Levkowetz
a7a6d956af
Adding a fix in soup2text for a common pathological case: <br><br> used instead
of <p /> to indicate paragraph breaks.
This changes the failed diff for /iesg/telechat/detail/354/ to show only three
differences, where two are whitespace differences and one shows a difference
between '@ietf.org . The' and '@ietf.org . The' and is an artifact of the text
extraction. Will look at fixing that next.
- Legacy-Id: 300
2007-06-11 03:36:08 +00:00
Henrik Levkowetz
7c60b321cd
Add BeautifulSoup.py to the ietf/contrib/ directory so it doesn't have to be installed separately
...
- Legacy-Id: 289
2007-06-10 14:02:11 +00:00
Henrik Levkowetz
06eae09af4
Removing unused imports from ietf/tests.py. Using the right Exception type in soup2html.
...
- Legacy-Id: 283
2007-06-10 11:43:19 +00:00
Henrik Levkowetz
10ce0e07dd
'soup2text' is an HTML-to-text converter which uses the BeautifulSoup.py module. It converts HTML to plain, paragraph-filled, readable text.
...
- Legacy-Id: 277
2007-06-10 11:27:02 +00:00