Robert Sparks
b1585124d6
Improve robustness of pdfization. Tune the test crawler. Commit ready for merge.
...
- Legacy-Id: 19813
2022-01-06 20:17:55 +00:00
Robert Sparks
3697180cc1
Reverted merge of timezone-aware migration efforts.
...
- Legacy-Id: 18792
2021-01-12 16:54:20 +00:00
Henrik Levkowetz
774e752a54
Snapshot of timezone-aware datatracker code. Tests pass, and the test-crawler shows only expected differences. Trunk changes merged in up to r18768.
...
- Legacy-Id: 18770
2020-12-16 23:53:37 +00:00
Henrik Levkowetz
7ee6bd4fb4
When doing test-crawling, ignore variations of the 'next=' query arg. (The code ignores other query args if 'next' is given).
...
- Legacy-Id: 18730
2020-12-04 16:04:01 +00:00
Henrik Levkowetz
5d03afa6aa
Reduced the number of htmlization URLs visited further.
...
- Legacy-Id: 17999
2020-06-16 20:07:11 +00:00
Henrik Levkowetz
221e989754
Fixed a bad regex in test-crawl
...
- Legacy-Id: 17970
2020-06-11 09:22:26 +00:00
Henrik Levkowetz
516f41e5d7
Excluded a majority of htmlized drafts at /doc/html (but keeping some for testing) in order to reduce the crawl time.
...
- Legacy-Id: 17918
2020-06-06 20:58:04 +00:00
Henrik Levkowetz
690fb3a370
Added a bunch of drafts for which we don't have text files to the test-crawler exclusion list.
...
- Legacy-Id: 17805
2020-05-16 13:50:45 +00:00
Henrik Levkowetz
695b6e0e86
Tweaked test-crawl to not visit all 180,000 /html/ pages.
...
- Legacy-Id: 17763
2020-05-08 18:49:33 +00:00
Henrik Levkowetz
25af6fbfad
Updated the test crawler for python3.
...
- Legacy-Id: 16438
2019-07-08 19:37:10 +00:00
Henrik Levkowetz
8d1d0cda97
Added a no-follow option to the test crawler, in order to be able to easily test a specific list of URLs.
...
- Legacy-Id: 16188
2019-05-06 13:35:29 +00:00
Henrik Levkowetz
b48caef487
Tweaked the test-crawler to not follow redirects to www.ietf.org. Asking the test client for non-datatracker URLs doesn't give back anything meaningful ,:-)
...
- Legacy-Id: 14930
2018-03-26 13:01:38 +00:00
Henrik Levkowetz
ef99946ca9
Fixed a bug in the handling of checks failures.
...
- Legacy-Id: 14477
2017-12-30 18:46:13 +00:00
Henrik Levkowetz
5bcecc7c54
Fixed a bug and added a URL exception for some redirected URLs in the test crawler.
...
- Legacy-Id: 13992
2017-07-28 12:50:39 +00:00
Henrik Levkowetz
eb610d2d94
Increased the test crawler's verbose output.
...
- Legacy-Id: 13685
2017-06-19 23:31:53 +00:00
Lars Eggert
76a3c8bdc0
Update vnu.jar and fix various HTML5 nits it found during a test crawl.
...
Commit ready for merge.
- Legacy-Id: 13118
2017-03-25 20:21:14 +00:00
Henrik Levkowetz
5bb9518b5f
Added some new exceptions to the test-crawler; files which are known to not exist, and files with known html character problems.
...
- Legacy-Id: 13037
2017-03-20 13:46:23 +00:00
Henrik Levkowetz
7296b951ee
Refined the test crawler a bit, to avoid extracting URLs to follow
...
from html outside the datatracker's control, such as uploaded WG
agendas. Also exempted some pages with known-bad character issues
from html validation, and refined the error reporting for html
validation failures.
- Legacy-Id: 13027
2017-03-19 19:34:50 +00:00
Henrik Levkowetz
a78c419845
Removed a debug print statement.
...
- Legacy-Id: 12870
2017-02-17 17:53:26 +00:00
Henrik Levkowetz
c344a18bdf
Fixed an issue with the test-crawler which could cause false positives for urls containing apostrophe.
...
- Legacy-Id: 12851
2017-02-16 09:58:34 +00:00
Henrik Levkowetz
9a3f6b059b
Merged Django-1.8 upgrade work to trunk. Adjusted migration names, and added migrations as necessary. Fixed some instances of broken html.
...
- Legacy-Id: 12507
2016-12-13 05:55:46 +00:00
Henrik Levkowetz
44269f1d73
Added a URL to skip to the test-crawler
...
- Legacy-Id: 12500
2016-12-09 13:04:22 +00:00
Henrik Levkowetz
fde59c1e1e
Removed debugging code.
...
- Legacy-Id: 12420
2016-11-29 22:20:30 +00:00
Henrik Levkowetz
bb9741193c
Added a URL to skip (from an uploaded html agenda).
...
- Legacy-Id: 12400
2016-11-28 13:38:31 +00:00
Henrik Levkowetz
8e11c7cb64
Fixed some invalid html, and tweaked the html validation settings in the test crawler.
...
- Legacy-Id: 12066
2016-09-30 18:47:56 +00:00
Henrik Levkowetz
4b0a9360f0
Merged in ^/branch/iola/event-saving-refactor-r10291, which refactors document saving to always use doc.save_with_history(events), and requires accompanying events. This branch also provides refactoring of recurring regexes in url patterns into a dictionary. As part of the merge, also refactored new code which didn't use the save_with_history() method.
...
- Legacy-Id: 11840
2016-08-23 10:52:08 +00:00
Henrik Levkowetz
3d48650c0d
Another test-crawler tweak.
...
- Legacy-Id: 11433
2016-06-20 22:47:04 +00:00
Henrik Levkowetz
de0753fa76
Tweaked the test crawler a bit to skip some slow and meaningless checks.
...
- Legacy-Id: 11431
2016-06-20 22:03:06 +00:00
Henrik Levkowetz
aee36651a5
Tweaked the test-crawler to give the same log line format for exception failures as for regular log lines.
...
- Legacy-Id: 10936
2016-03-16 13:21:02 +00:00
Ole Laursen
86c3a430d1
Merge in ^/branch/iola/event-saving-refactor-r10076, fixing a few problems
...
- Legacy-Id: 10298
2015-10-27 10:37:06 +00:00
Henrik Levkowetz
f41553f3d1
Added 2 new file existence checks to the check framework, since we're now reading email aliases for groups and documents from files. Added a call out to run_checks() in the test-crawler, so we don't see failures due to missing files.
...
- Legacy-Id: 10204
2015-10-13 19:07:11 +00:00
Henrik Levkowetz
cfefc0ae58
Changed the default settings for the test crawler from ietf.settings to ietf.settings_testcrawl.
...
- Legacy-Id: 10120
2015-10-01 20:54:46 +00:00
Ole Laursen
5e4645d7d2
Summary: Trim the test-crawl imports
...
- Legacy-Id: 10107
2015-09-29 13:21:24 +00:00
Henrik Levkowetz
11411d2c30
Merged in an update from trunk@9942.
...
- Legacy-Id: 9961
2015-08-03 14:12:38 +00:00
Henrik Levkowetz
f48452853f
Changed test-crawl to avoid unnecessary repetitions of the blacklisting message.
...
- Legacy-Id: 9933
2015-08-01 12:47:03 +00:00
Henrik Levkowetz
948804f73f
Added static javascript and image files to the URLs crawled by the test-crawler.
...
- Legacy-Id: 9913
2015-07-29 17:03:32 +00:00
Henrik Levkowetz
224fef557c
Added a --random switch to choose between different test-crawler modes.
...
- Legacy-Id: 9893
2015-07-27 16:52:26 +00:00
Henrik Levkowetz
8612ce92c0
Merged in [9765] from lars@netapp.com:
...
Add option to crawl as a logged-in user (--user).
Add --pedantic option for vnu crawl, which stops the crawl on (most) errors.
Randomize the order in which URLs are crawled, so that repeated crawls don't
hit the same URLs in the same order.
- Legacy-Id: 9785
Note: SVN reference [9765] has been migrated to Git commit 9b4e61049a704127e1200549fcc410326efffddb
2015-07-18 12:00:37 +00:00
Henrik Levkowetz
ed66e24e7c
Merged in [9726] from lars@netapp.com:
...
Add HTML5 validation based on validator.nu to test-crawl.
- Legacy-Id: 9763
Note: SVN reference [9726] has been migrated to Git commit 5826bcbf80
2015-07-18 08:20:35 +00:00
Lars Eggert
5826bcbf80
Add HTML5 validation based on validator.nu to test-crawl. Commit ready for merge.
...
- Legacy-Id: 9726
2015-07-15 12:41:09 +00:00
Henrik Levkowetz
926b5831d6
Tweaked the test-crawl summary.
...
- Legacy-Id: 9574
2015-04-27 08:33:36 +00:00
Henrik Levkowetz
60738dc8bd
Don't use non-zero exit code for test-crawler runs with nonvalidating html warnings.
...
- Legacy-Id: 9559
2015-04-25 06:36:22 +00:00
Henrik Levkowetz
eadf421fc1
Added a new url folding operation for the html verification.
...
- Legacy-Id: 9557
2015-04-24 22:11:34 +00:00
Henrik Levkowetz
e32af567ef
Added html validation to the test crawler; it will now report html which fails validation with 'WARN' indications. Reorganized the code somewhat, collecting functions, globals, etc. in groups.
...
- Legacy-Id: 9549
2015-04-24 20:30:46 +00:00
Henrik Levkowetz
7c67e26fa4
Added a --logfile switch to the test crawler, in order to be able to control whether a logfile should be used or not. It's not particularly helpful when running on a buildbot slave, which catches stdout anyway.
...
- Legacy-Id: 9252
2015-03-19 20:28:25 +00:00
Henrik Levkowetz
86997e1e95
Turned the api.py file into a module. Moved the makeresources management command to the api module. Added some api tests. Added crawling of api files to the test-crawler. Adjusted some resource files discovered by the test suite and test-crawler. Removed a bunch of empty model files.
...
- Legacy-Id: 9144
2015-03-03 20:23:36 +00:00
Henrik Levkowetz
7ecfac6308
Merged in personal/henrik/django-1.7@9020 which upgrades Django from 1.6.0 to 1.7.4 and applies the needed changes to the datatracker code to work with release 1.7.x.
...
- Legacy-Id: 9028
2015-02-08 21:16:44 +00:00
Henrik Levkowetz
028b7e315a
Reverted to [9025] because commit [9026] failed (it was incomplete with a broken working dir).
...
- Legacy-Id: 9027
Note: SVN reference [9026] has been migrated to Git commit 4a3749a66b
2015-02-08 20:03:16 +00:00
Henrik Levkowetz
4a3749a66b
Merged in personal/henrik/django-1.7@9020 which upgrades Django from 1.6.0 to 1.7.4 and applies the needed changes to the datatracker code to work with release 1.7.x.
...
- Legacy-Id: 9026
2015-02-08 19:16:46 +00:00
Henrik Levkowetz
1210f77604
With django 1.7, standalone scripts need to call django.setup() before doing any operations involving models. Modified all scripts in bin/ and ietf/bin/ which seemed to need it.
...
- Legacy-Id: 9017
2015-02-07 21:13:38 +00:00