2006-09-22 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 3.1.0
2006-09-22 Michael D. Stenner <mstenner@linux.duke.edu>
* test/grabberperf.py, urlgrabber/grabber.py,
urlgrabber/keepalive.py, urlgrabber/sslfactory.py:
Added James Bowes' SSL patch to use M2Crypto when available.
2006-09-21 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2006-09-21 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 3.0.0
2006-07-20 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2006-07-20 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.10
2006-07-20 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/: byterange.py, grabber.py, keepalive.py:
Added patch from James Bowes to make keepalive, byteranges, etc.
work with https (oops!)
2006-04-04 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/keepalive.py:
Fixed a minor error reporting bug due to changes in python 2.4.
2006-03-22 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Patch from Jeremy Katz to catch read errors after the file has been
opened.
2006-03-02 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2006-03-02 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.9
2006-03-02 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py:
Added tests to make sure that the "quote" option works as
advertised.
2006-03-02 Michael D. Stenner <mstenner@linux.duke.edu>
* scripts/urlgrabber, test/test_grabber.py, urlgrabber/grabber.py:
Significant improvement to URL parsing. Parsing is now broken out
into a separate class (URLParser). It will now (by default) guess
whether a URL is already quoted, properly handle local files and
URLs on windows, and display un-quoted versions of the filename in
the progress meter.
2006-02-22 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2006-02-22 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.8
2006-02-22 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/: grabber.py, mirror.py:
Added a reget progress bar patch from Menno, and fixed the annoying
_next IndexError bug. Thanks to Edinelson Keiji Shimokawa for
getting me looking in the right direction.
2005-10-22 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-10-22 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog, TODO, urlgrabber/__init__.py:
Updated TODO and preparing for release.
2005-10-22 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py, test/test_keepalive.py,
test/test_mirror.py, test/support/testdata/test_post.php,
urlgrabber/byterange.py, urlgrabber/grabber.py,
urlgrabber/keepalive.py, urlgrabber/mirror.py:
Added some slick logging support to urlgrabber. It can be
controlled by the calling application or enabled from an
environment variable.
Allowed for HTTP POSTs by passing in the kwarg "data", just as for
urllib2.
2005-08-19 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/progress.py:
Another minor simplification offered by the WTF folks.
2005-08-18 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/progress.py:
Fixed a minor formatting bug, helpfully pointed out by the nice
folks at the daily WTF:
http://thedailywtf.com/forums/41204/ShowPost.aspx
2005-08-17 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py, urlgrabber/grabber.py:
Added Menno's idea for catching KeyboardInterrupt. This allows the
program to do things like hit the next retry, jump to the next
mirror, etc. when the user hits ctrl-c.
Also modified the behavior of failure_callback slightly. It now
gets called for EVERY failure, even the last one. That is, if
there's a failure on the last try, the failure callback gets called
and (assuming it doesn't raise anything) the exception gets raised.
Previously, it would not be called on the last failure. This is
another MINOR compatibility break.
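To illustrate the new failure_callback timing, here is a minimal sketch of a retry loop. It is not urlgrabber's actual code: the `retry` helper and the `(exc, tries)` callback signature are assumptions made up for this example.

```python
def retry(func, attempts, failure_callback=None):
    """Call func up to `attempts` times; report every failure to the callback."""
    for tries in range(1, attempts + 1):
        try:
            return func()
        except IOError as exc:
            # The callback runs on EVERY failure, including the last one.
            if failure_callback is not None:
                failure_callback(exc, tries)
            if tries == attempts:
                # Only reached if the callback itself did not raise.
                raise


seen = []


def always_fails():
    raise IOError("simulated download failure")


try:
    retry(always_fails, 3, failure_callback=lambda exc, tries: seen.append(tries))
except IOError:
    pass

print(seen)  # [1, 2, 3]
```

Under the old behavior described above, the callback would have seen only tries 1 and 2.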
2005-08-17 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py, test/test_mirror.py, urlgrabber/grabber.py:
except HTTPError, which basically means all HTTP return codes that
are not 200 and not handled internally (such as the 401
Unauthorized code). Previously, these were caught as IOErrors (of
which HTTPError is an indirect subclass) and re-raised as
URLGrabErrors with errno 4. They are NOW reraised as URLGrabErrors
with errno 14, and with the .code and .exception attributes set to
the HTTP status code and HTTPError exception object respectively.
This represents a minor compatibility break, as these "errors" are
now not handled the same way. They will be raised with a different
errno and errno 14 is not in the default list of retrycodes
(whereas 4 is).
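The practical effect of the errno change can be sketched as follows. `GrabError` is a hypothetical stand-in for URLGrabError, and the retry list here is illustrative only (it contains 4 but not 14, as described above); neither is copied from grabber.py.

```python
class GrabError(Exception):
    """Hypothetical stand-in for URLGrabError with optional HTTP details."""

    def __init__(self, errno, strerror, code=None, exception=None):
        Exception.__init__(self, errno, strerror)
        self.errno = errno
        self.strerror = strerror
        self.code = code            # HTTP status code, if any
        self.exception = exception  # original HTTPError object, if any


# Illustrative retry list: 4 is retried, 14 is not.
RETRYCODES = [4]


def should_retry(err, retrycodes=RETRYCODES):
    """Return True if a failure with this errno would be retried."""
    return err.errno in retrycodes


old_style = GrabError(4, "IOError: HTTP Error 404")
new_style = GrabError(14, "HTTP Error 404", code=404)

print(should_retry(old_style))  # True
print(should_retry(new_style))  # False
print(new_style.code)           # 404
```

A caller that still wants such errors retried could pass a list that includes 14, for example `should_retry(new_style, retrycodes=[4, 14])`.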
2005-06-27 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog, urlgrabber/byterange.py:
Fixed two namespace-related bugs.
2005-05-19 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_keepalive.py, urlgrabber/keepalive.py:
started using sys.version_info rather than parsing sys.version
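A short sketch of why this helps. The `parse_version_string` helper is hypothetical and only mimics naive string parsing; it is not the code that was in keepalive.py. A vendor suffix such as SUSE's "2.3+" breaks the string approach, while sys.version_info is already a tuple of integers.

```python
import sys


def parse_version_string(version):
    """Hypothetical naive parser: split the first token on dots, convert to int."""
    return tuple(int(part) for part in version.split()[0].split("."))


# A plain version string parses fine.
print(parse_version_string("2.3.4 (#1, Feb 2 2005)"))

# A vendor suffix such as "2.3+" makes int() fail.
try:
    parse_version_string("2.3+ (#1, Feb 2 2005)")
except ValueError as exc:
    print("parse failed:", exc)

# sys.version_info needs no parsing and compares directly against tuples.
if sys.version_info >= (2, 3):
    print("running on Python 2.3 or newer")
```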
2005-04-04 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/keepalive.py:
Fixed a typo in that last change. Always remember to do "make
test" before committing :)
2005-04-04 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/keepalive.py:
Changed the version checking so SUSE's python "2.3+" version string
doesn't cause trouble.
2005-03-14 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py, urlgrabber/grabber.py:
Added tests to check proxy format. Otherwise, missing proto could
lead to very obscure exceptions.
2005-03-08 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-03-08 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.6
2005-03-08 Michael D. Stenner <mstenner@linux.duke.edu>
* makefile, setup.py:
Changed development status to Beta and fixed a small build bug. It
wasn't including MANIFEST and so wasn't building things properly
from the distributed tarballs.
2005-03-03 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-03-03 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.5
2005-03-03 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Fixed a bug caused by the recent proxy fix for python 2.2. This
new bug came up when both the FTPHandler and FTPRangeHandler were
pushed in.
2005-02-28 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-02-28 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.4
2005-02-28 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Added http_headers and ftp_headers options.
2005-02-25 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, test/base_test_code.py, test/test_grabber.py,
test/support/squid/squid-setup, test/support/squid/squid.conf,
test/support/testdata/README, test/support/testdata/reference,
test/support/testdata/short_reference,
test/support/testdata/mirror/broken/broken.txt,
test/support/testdata/mirror/broken/reference,
test/support/testdata/mirror/broken/short_reference,
test/support/testdata/mirror/broken/test1.txt,
test/support/testdata/mirror/broken/test2.txt,
test/support/testdata/mirror/m1/broken.txt,
test/support/testdata/mirror/m1/reference,
test/support/testdata/mirror/m1/short_reference,
test/support/testdata/mirror/m1/test1.txt,
test/support/testdata/mirror/m1/test2.txt,
test/support/testdata/mirror/m2/broken.txt,
test/support/testdata/mirror/m2/reference,
test/support/testdata/mirror/m2/short_reference,
test/support/testdata/mirror/m2/test1.txt,
test/support/testdata/mirror/m2/test2.txt,
test/support/testdata/mirror/m3/broken.txt,
test/support/testdata/mirror/m3/reference,
test/support/testdata/mirror/m3/short_reference,
test/support/testdata/mirror/m3/test1.txt,
test/support/testdata/mirror/m3/test2.txt,
test/support/vsftpd/ftp-server-setup,
test/support/vsftpd/vsftpd.conf,
test/support/vsftpd/vsftpd.ftpusers,
test/support/vsftpd/vsftpd.user_list, urlgrabber/grabber.py:
Added data and instructions for setting up ftp and proxy servers
for testing and also added proxy tests. Fixed two bugs related to
proxies. One bug made it so the first time proxy data was passed
in, it was not actually used. The other prevented proxies from
working for ftp or non-keepalive http using python 2.2. The latter
is a bug in urllib2, but is now worked around (somewhat hackishly,
I admit) in urlgrabber.
2005-02-22 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-02-22 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.3
2005-02-21 Michael D. Stenner <mstenner@linux.duke.edu>
* README:
updated README with instructions for building rpms and notes about
python version compatibility.
2005-02-21 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
updated project url
2005-02-21 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Fixed minor exception bug.
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.2
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Added http character encoding patch from Chris Lumens.
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py:
slight cleanup - no new code
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.1
2005-02-14 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_keepalive.py, test/test_mirror.py,
urlgrabber/byterange.py, urlgrabber/grabber.py,
urlgrabber/keepalive.py:
Fixed python 2.4 bug - added .code attribute to returned file
objects. Changed keepalive.HANDLE_ERRORS behavior for the way 2.4
does things.
2005-02-08 Ryan Tomayko <rtomayko@naeblis.cx>
* makefile:
Added Python 2.4 to lists of pythons to test.
2005-02-04 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/byterange.py:
Fixed RH bug #147124. range_tuple_header was not accounting for a
possible None value.
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=147124
2005-01-14 Ryan Tomayko <rtomayko@naeblis.cx>
* test/test_grabber.py:
Fixed proftpd test (was raising error when it shouldn't have been)
2005-01-14 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/: grabber.py, progress.py:
Applied urlgrabber-add-text.patch from Terje Rosten
2004-12-12 Ryan Tomayko <rtomayko@naeblis.cx>
* test/base_test_code.py, test/test_grabber.py,
urlgrabber/byterange.py:
Fix for RH bug #140387
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=140387
2004-10-17 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Added except for socket resource error when reading from urllib2
file object.
2004-10-16 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Fixed AttributeError problem that was pretty common after additions
for timeout support.
2004-10-08 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO, urlgrabber/grabber.py:
Added timeout support.
2004-09-07 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, scripts/urlgrabber, urlgrabber/__init__.py,
urlgrabber/grabber.py, urlgrabber/mirror.py,
urlgrabber/progress.py:
Added Ryan's progress changes in. That still has some quirks I
need to iron out.
Made all urlgrabber imports relative to work more nicely with yum's
package layout. This was straightforward except for the
__version__ import, which I solved in a slightly icky way.
2004-08-25 Michael D. Stenner <mstenner@linux.duke.edu>
* MANIFEST.in:
Added "MANIFEST" to MANIFEST.in. Without this, one couldn't
correctly rebuild an rpm (for example) from the distributed
tarball.
2004-08-24 Ryan Tomayko <rtomayko@naeblis.cx>
* test/threading/batchgrabber.py:
Updates for Michael's multifile progress bar changes.
2004-08-21 Michael D. Stenner <mstenner@linux.duke.edu>
* makefile, urlgrabber/__init__.py:
Fixed a typo in the new 'release' target. Reformatted the doc in
__init__.py to be under 80 columns in PKG-INFO.
2004-08-21 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
updated ChangeLog
2004-08-21 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/__init__.py:
release 2.9.0
2004-08-21 Michael D. Stenner <mstenner@linux.duke.edu>
* makefile:
Added "release" and "daily" targets to makefile for automated
releases.
2004-08-20 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: test_grabber.py, test_mirror.py:
Updated callback tests to accommodate the new callback semantics.
Also added many new checkfunc tests for URLGrabber.
2004-08-20 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/: grabber.py, mirror.py:
Changed the callback syntax. This affects failure_callback use in
both URLGrabber and MirrorGroup instances. It also affects
checkfunc use in URLGrabber instances. If you use these features,
you WILL need to change your code.
2004-08-12 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_mirror.py:
enhanced the failover test to ensure that the bad mirror is tried
and fails
2004-08-12 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py, urlgrabber/grabber.py:
fixed typo in _make_callback and added testcase
2004-08-11 Michael D. Stenner <mstenner@linux.duke.edu>
* makefile, test/test_grabber.py, urlgrabber/grabber.py:
Fixed a bug in URLGrabber._retry that prevented failure callbacks
from working. Also added a testcase for failure callbacks.
2004-08-09 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/mirror.py:
fixed documentation typos
2004-07-22 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, urlgrabber/grabber.py, urlgrabber/progress.py:
Committing new progress meter code. This new version includes an
emulation of the old meter with pretty good backward compatibility.
However, if people are using custom progress meters, they might
break. This is a good general warning for the next few versions of
urlgrabber, actually. As the threading stuff evolves, the progress
meter may have to change its behavior a bit.
2004-07-21 Ryan Tomayko <rtomayko@naeblis.cx>
* .cvsignore:
Ignoring kdevelop, kate, and anjuta project files. Ignoring
directories created by distutils: build, dist
2004-07-21 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
opener related items for ALPHA are completed. Moved stuff related
to getting a release out up into ALPHA section.
2004-07-21 Ryan Tomayko <rtomayko@naeblis.cx>
* setup.py, urlgrabber/__init__.py:
Added distutils trove classifiers to setup.py for eventual
inclusion in PyPI. Added general description string for urlgrabber
package (__init__.py).
2004-07-21 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Fixed bug with urllib2.OpenerDirector caching. Added cache_openers
option (pass to any urlXXX function/method) to control
OpenerDirector caching. The default is to cache openers.
2004-04-02 Michael D. Stenner <mstenner@linux.duke.edu>
* makefile, test/test_grabber.py:
Tidied up the grabber tests so we're not always skipping those ftp
reget tests (they just don't happen now). Also, modified the
makefile so that "make test" runs the testsuite with -v 1 for BOTH
python2.2 and 2.3 if they are available.
2004-03-31 Michael D. Stenner <mstenner@linux.duke.edu>
* LICENSE, maint/license-notice, scripts/urlgrabber,
test/grabberperf.py, test/runtests.py, test/test_byterange.py,
test/test_grabber.py, test/test_keepalive.py, test/test_mirror.py,
test/threading/batchgrabber.py, urlgrabber/byterange.py,
urlgrabber/grabber.py, urlgrabber/keepalive.py,
urlgrabber/mirror.py, urlgrabber/progress.py:
Replace LICENSE with the LGPL. Added maint/license-notice, which
contains the corresponding license-snippet that should be included
at the top of each file. Finally, added this snippet to all of the
.py files. urlgrabber is now fully LGPL-ed.
2004-03-31 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
fixed minor typo in opener option docs
2004-03-31 Ryan Tomayko <rtomayko@naeblis.cx>
* test/runtests.py:
Added support for specifying verbosity of unit tests when using
runtests.py. 'python runtests.py --help' for usage.
2004-03-31 Ryan Tomayko <rtomayko@naeblis.cx>
* test/test_grabber.py, urlgrabber/grabber.py:
Added opener kwarg/option to allow overriding of
urllib2.OpenerDirector.
2004-03-31 Michael D. Stenner <mstenner@linux.duke.edu>
* test/munittest.py, test/test_keepalive.py,
urlgrabber/keepalive.py:
Fixed a few bugs related to python 2.3.
2004-03-29 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO:
updated TODO
2004-03-29 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog, makefile:
Modified "make ChangeLog" target so that it doesn't print the
times. If we use local times, then it varies with who rebuilds the
ChangeLog file. Also updated the ChangeLog to the new format.
2004-03-28 Michael D. Stenner <mstenner@linux.duke.edu>
* makefile:
Added "make test" target.
2004-03-28 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Fixed documentation error in the range docs. Temporarily disabled
the opener-caching because it seems to not quite work correctly
yet.
2004-03-28 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_keepalive.py:
Changed the wait time for the dropped-connection test. The apache
keepalive timeout is 15 seconds, so I set the test to 20.
2004-03-28 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
Removed items from TODO list related to options handling as well as
the item on reget not working due to how options are copied.
2004-03-28 Ryan Tomayko <rtomayko@naeblis.cx>
* test/test_grabber.py, urlgrabber/grabber.py:
Modified URLGrabberOptions to use "instance-inheritance" pattern
instead of copying all options.
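One way to picture the "instance-inheritance" idea is the sketch below. The `Options` class, its `derive` method, and the option names are assumptions for illustration and do not reproduce URLGrabberOptions itself: a derived object stores only its overrides and delegates everything else to its parent.

```python
class Options(object):
    """Hypothetical options object that falls back to a parent for unset values."""

    def __init__(self, parent=None, **kwargs):
        self._parent = parent
        self.__dict__.update(kwargs)

    def __getattr__(self, name):
        # Called only when normal lookup fails, so locally set values win.
        parent = self.__dict__.get("_parent")
        if parent is not None:
            return getattr(parent, name)
        raise AttributeError(name)

    def derive(self, **kwargs):
        """Return a child that overrides only the given options."""
        return Options(parent=self, **kwargs)


defaults = Options(retry=3, bandwidth=0)
per_call = defaults.derive(retry=5)

print(per_call.retry)      # 5, overridden locally
print(per_call.bandwidth)  # 0, inherited from defaults

# Nothing was copied, so later changes to the parent remain visible.
defaults.bandwidth = 1000
print(per_call.bandwidth)  # 1000
```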
2004-03-28 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/mirror.py:
Fixed real small typo in mirror doc string.
2004-03-28 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Tidied up some of the callback code.
2004-03-21 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog, TODO, test/threading/batchgrabber.py,
test/threading/urls-keepalive, test/threading/urls-many:
Updated ChangeLog, TODO. Played with the batchgrabber stuff a
little.
2004-03-21 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_keepalive.py:
added a simple threading test
2004-03-21 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/keepalive.py:
Rearranged keepalive a little to make it nicer. Also made it deal
better with dead connections. Now it checks ALL "free"
connections.
2004-03-20 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, urlgrabber/keepalive.py:
OK, I THINK I've made keepalive thread-friendly. The interface is
completely unchanged. It passes all the existing unittests
(although none of them test threading issues). I need to make some
new tests for thread stuff.
2004-03-19 Michael D. Stenner <mstenner@linux.duke.edu>
* test/grabberperf.py:
Made tests a little cleaner and added "none" test, which is a
reasonable "ideal" test case.
2004-03-19 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
Removed item on reducing urllib2 opener director creation.
2004-03-19 Ryan Tomayko <rtomayko@naeblis.cx>
* test/grabberperf.py:
Added proxies to performance tests to ensure that ProxyHandlers are
being added to the urllib2 OpenerDirector.
2004-03-19 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Caching urllib2 ProxyHandler and OpenerDirectors instead of
creating new ones for each request.
2004-03-19 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, urlgrabber/grabber.py:
In grabber.py, fixed a bug with urlread when no limit is specified.
Made the "natural" file translation work for relative pathnames.
2004-03-19 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: base_test_code.py, test_grabber.py:
Added file and ftp tests for reget.
2004-03-18 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: base_test_code.py, munittest.py, runtests.py,
test_byterange.py, test_grabber.py, test_keepalive.py,
test_mirror.py:
By popular demand, I broke out the unittest coolness into a
separate module (munittest.py) which is largely a drop-in
replacement for unittest.py. The whole test setup is just way cool
at this point.
2004-03-18 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: base_test_code.py, runtests.py, test_byterange.py,
test_grabber.py, test_keepalive.py, test_mirror.py:
Made tests prettier (again... I do like pretty tests) and created
the test result "skip" intended for things like ftp tests where
there may not be a server available. None of the available "ok",
"FAIL" or "ERROR" were really appropriate.
2004-03-17 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/mirror.py:
Added a little more documentation.
2004-03-17 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, test/test_mirror.py, urlgrabber/mirror.py:
Made the control of MirrorGroup failover action WAY cooler and more
flexible. Added tests
2004-03-16 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, test/base_test_code.py, test/test_grabber.py,
urlgrabber/grabber.py:
Changed test base_url according to new web page. Added prefix
argument to URLGrabber, along with docs and test. Also documented
URLGrabber failure_callback.
2004-03-15 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO:
updated TODO after alpha-huddle
2004-03-15 Ryan Tomayko <rtomayko@naeblis.cx>
* test/test_grabber.py:
Grabber tests now call package level urlgrab, urlopen, urlread
functions.
2004-03-15 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/__init__.py:
Added urlgrab, urlopen, urlread as package level exports.
2004-03-14 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog, scripts/urlgrabber, urlgrabber/byterange.py,
urlgrabber/grabber.py, urlgrabber/keepalive.py,
urlgrabber/mirror.py:
Added Id: strings to distributed python files, made grabber.py use
the global version information from __init__.py, and updated
copyright statements to include 2004 :)
2004-03-14 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO:
updated TODO
2004-03-14 Michael D. Stenner <mstenner@linux.duke.edu>
* setup.py, urlgrabber/__init__.py:
Put version, date, url, and author information into __init__.py and
have setup.py read them in from there. Also put Id: tag into
__init__.py, which we should probably do in the other files as
well.
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_mirror.py, urlgrabber/mirror.py:
improved mirror.py docs, added MGRandomStart (which is equivalent
to yum's roundrobin failover policy) and MGRandomOrder. Added
associated tests for these new classes.
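The difference between the two policies can be sketched like this. The function names and mirror URLs are hypothetical and this is not the mirror.py implementation: a random start keeps the list order but rotates it to begin at a random mirror, while a random order shuffles the whole list.

```python
import random


def random_start(mirrors, rng=random):
    """Keep the relative order but begin at a randomly chosen mirror."""
    start = rng.randrange(len(mirrors))
    return mirrors[start:] + mirrors[:start]


def random_order(mirrors, rng=random):
    """Return a shuffled copy; the input list is left untouched."""
    shuffled = list(mirrors)
    rng.shuffle(shuffled)
    return shuffled


mirrors = ["http://m1.example/", "http://m2.example/", "http://m3.example/"]
print(random_start(mirrors))
print(random_order(mirrors))
```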
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_byterange.py:
converted test_byterange.py to use base_test_code
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO:
updated TODO
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: base_test_code.py, test_grabber.py, test_keepalive.py:
making tests a little nicer
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: base_test_code.py, test_grabber.py, test_mirror.py:
added callback test to test_mirror.py. Restructuring the test code
a little. Starting to move common definitions (like the reference
data) into base_test_code.py, which can be accessed from the other
test modules. Also defined UGTestCase which prints things slightly
more prettily.
Also, I'd like to go on record as saying that I absolutely hate the
__attribute munging in python, and I don't think it should ever be
used. Thank you for your attention.
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/: grabber.py, mirror.py:
added "failure_callback" options to both MirrorGroup and
URLGrabber. These allow the calling application to know when a
retry/failover is occurring and what caused it (the exception is
passed to the callback).
2004-03-13 Michael D. Stenner <mstenner@linux.duke.edu>
* test/: runtests.py, test_grabber.py, test_keepalive.py:
added keepalive tests; fixed test_bad_reget_type
2004-03-12 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/grabber.py:
Fixed a stupid typo in the splituser fix (from 5 minutes ago)
2004-03-12 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/.cvsignore:
Added .cvsignore to urlgrabber/ directory
2004-03-12 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO:
Updated TODO.
2004-03-12 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber/: grabber.py, mirror.py:
Fixed a typo in mirror.py. Added reget docs and reformatted
module-level docs in grabber.py. Also fixed a "bug" in grabber.py
that came about because the python 2.3 urllib2 no longer includes
splituser and splitpasswd.
2004-03-12 Ryan Tomayko <rtomayko@naeblis.cx>
* test/test_grabber.py:
Using a tempfile for urlgrab test instead of named file in cwd.
2004-03-12 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Removed ugly code that was supposed to allow a file-object to be
passed to urlgrab as the output file.
2004-03-12 Ryan Tomayko <rtomayko@naeblis.cx>
* test/runtests.py:
Added mirror tests to main test suite.
2004-03-12 Michael D. Stenner <mstenner@linux.duke.edu>
* test/test_grabber.py, urlgrabber/grabber.py:
Added first attempt at reget support. Created test suite for http.
2004-03-11 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog, MANIFEST.in, README, makefile, setup.py,
maint/cvs2cl.pl, maint/usermap:
Updated MANIFEST.in and README. Created 'maint' directory
containing cvs2cl.pl and usermap file (to map usernames to email
addresses for cvs2cl.pl). Also created a "ChangeLog" target in the
makefile to build the ChangeLog.
2004-03-11 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, urlgrabber/keepalive.py:
[TODO] added note about restructuring import style
[keepalive.py] improved exception-handling and made it a bit more
fault-tolerant. Specifically, it now better addresses
dropped-connections. Also added a dropped-connection test to the
test code in keepalive.py. For now, I'd like to keep this code in
keepalive.py for convenient diagnostics from users.
2004-03-10 Michael D. Stenner <mstenner@linux.duke.edu>
* TODO, test/test_mirror.py, urlgrabber/grabber.py,
urlgrabber/keepalive.py, urlgrabber/mirror.py:
Added mirror code and associated test code.
[grabber.py] Edited URLGrabError doc string to reflect MirrorGroup
error code and new error code policy.
[TODO] Moved reget to ALPHA (from ALPHA 2)
[keepalive.py] fixed problem with the new python 2.3 httplib. They
now raise BadStatusLine from a new place.
2004-03-10 Ryan Tomayko <rtomayko@naeblis.cx>
* test/: grabberperf.py, test_byterange.py, test_grabber.py:
Fixed up tests to work with new directory layout. Note: most
sys.path hackery has been removed so you will need to setup
PYTHONPATH manually before running test scripts.
2004-03-10 Ryan Tomayko <rtomayko@naeblis.cx>
* setup.py:
Fixed up for new cvs structure. Tried to simplify a bit while I was
here.
2004-03-10 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
Edited the cvs repo to restructure a little bit. All module python
code (grabber.py, byterange.py, progress.py, keepalive.py, and
__init__.py) has been moved into a "urlgrabber" directory.
Removed the ChangeLog file from the repo and created a new empty
one. In the initial (empty) version, I included the old logs from
before the repo-merge. Therefore, those logs appear as a single
log-entry. It's a little bit of a hack, but it's good enough. All
the info is there, and it allows us to move forward without having
to keep special-casing those old logs.
All future changelogs will be generated with cvs2cl.pl.
Other than these actions, I didn't do anything to any files.
Therefore, a number of things are certainly broken, as they expect
things to be in other places. Things I can think of: test code,
setup.py.
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
raise exceptions for invalid ranges DONE.
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Added support for RangeError to URLGrabError handling (errno 9).
Also, default is _not_ to retry grabs resulting in RangeError.
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/byterange.py:
Fixed try/except without an explicit exception. That's a no-no.
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/byterange.py:
raising RangeError anytime a range is non-satisfiable.
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Always copy_local when a byte range is specified to urlgrab.
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
urllib2 conditionals were cleaned up a while ago..
2004-03-08 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
Confirmed FTP connections are being closed properly when using
ranges. Added item for using CacheFTPHandler so FTP connections
are reused. Moved reget item under ALPHA 2. Added items for
keepalive/progress_meter w/ multiple threads. viewcvs DONE. Test
under multiple threads DONE. Basic performance tests DONE.
2004-03-06 Ryan Tomayko <rtomayko@naeblis.cx>
* test/threading/: urls-keepalive, urls-many:
Some URLs for batchgrabber.py.
2004-03-06 Ryan Tomayko <rtomayko@naeblis.cx>
* test/threading/batchgrabber.py:
Module for testing urlgrabber w/ multiple threads.
2004-03-01 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Performance tweak to bypass URLGrabberFileObject.read when not
using progress_obj or throttle.
2004-03-01 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Removed speed test from grabber.py. This test has been moved to
test/grabberperf.py
2004-03-01 Ryan Tomayko <rtomayko@naeblis.cx>
* test/grabberperf.py:
First cut at a module to run through some performance tests for
grabber.py
2004-03-01 Ryan Tomayko <rtomayko@naeblis.cx>
* test/test_grabber.py:
Clean up temp files after testing urlgrab.
2004-02-28 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
Added reget and batch stuff under the 'maybe' section.
2004-02-22 Ryan Tomayko <rtomayko@naeblis.cx>
* urlgrabber/grabber.py:
Brought in fix from yum for user:pass escaping in URLs.
2004-02-22 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO, urlgrabber/grabber.py:
Added checkfunc support to urlread.
2004-02-22 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO, urlgrabber/byterange.py:
Changed all variables/args named 'range' to rangetup or brange to
avoid shadowing the python builtin of the same name.
2004-02-22 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
Added item about handling KeyboardInterrupt exceptions.
2004-02-22 Ryan Tomayko <rtomayko@naeblis.cx>
* TODO:
Restructured TODO list based on recent email from Michael.
2004-02-14 Ryan Tomayko <rtomayko@naeblis.cx>
* .cvsignore, MANIFEST.in, README, TODO, progress_meter.py,
setup.py, urlgrabber.py, scripts/urlgrabber, test/.cvsignore,
test/runtests.py, test/test_byterange.py, test/test_grabber.py,
test/test_urlgrabber.py, urlgrabber/__init__.py,
urlgrabber/byterange.py, urlgrabber/grabber.py,
urlgrabber/progress.py:
Mass commit/merge of changes made for urlgrabber restructuring..
See ChangeLog for details.
2004-02-11 Michael D. Stenner <mstenner@linux.duke.edu>
* ChangeLog:
These are the changes from when ryan was working in a separate
repo. They are included here in bulk for ChangeLog simplicity.
2004-02-09 Friday rtomayko
* TODO: Initial commit of TODO list..
2004-01-29 Thursday rtomayko
* grabber.py, grabbertests.py, urlgrabber: Added proxy support.
* grabber.py: Removed default option copying in URLGrabber class
because this could cause side-effects.
2004-01-25 Sunday rtomayko
* setup.py: Fixed bad syntax on doc files..
* setup.py, urlgrabber: Added urlgrabber script to distutils
install.
2003-12-16 Tuesday rtomayko
* byterange.py, grabber.py: Initial crack at range support for
FTP - working but ugly..
2003-12-12 Friday rtomayko
* byterange.py, byterangetests.py: Make range tuples work like
python slices. endpos is non-inclusive.
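The slice-like semantics in the entry above can be illustrated with a minimal sketch. The helper name `apply_range` is hypothetical and is not part of byterange.py; it only shows that a `(startpos, endpos)` tuple selects bytes the way `data[startpos:endpos]` would, with endpos excluded.

```python
def apply_range(data, rangetup):
    """Return the bytes selected by a (startpos, endpos) tuple.

    Hypothetical helper for illustration only; endpos is non-inclusive,
    mirroring Python slice semantics.
    """
    startpos, endpos = rangetup
    return data[startpos:endpos]


payload = b"0123456789"
# Selects bytes at offsets 0 through 4; the byte at offset 5 is excluded.
first_five = apply_range(payload, (0, 5))
```

Because endpos is excluded, adjacent ranges such as (0, 5) and (5, 10) cover a file without overlapping.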
2003-12-10 Wednesday rtomayko
* scripts/urlgrabber: Quick script to call urlgrabber from the
command line. I've been using it in place of wget to get as
much testing as possible.
* grabber.py: Fixed small problem with progress meter not
updating properly. This was caused by moving _do_grab to
URLGrabberFileObject.
* grabber.py: Moved _do_open and per-request variables into
URLGrabberFileObject
* grabber.py: Added doc for user_agent kwarg
* grabber.py: Moved urllib2 Handler configuration to
URLGrabberFileObject. Each request has a
urllib2.OpenerDirector now that is customized based on
options.
* grabber.py: Added keepalive kwarg. keepalive can be
specified on a per-request basis now.
* grabber.py: Changed method of handling retries. I'm not
sure if I like this yet. All URLGrabber.urlXXX methods
define a function and pass it to the URLGrabber._retry()
method which calls the function provided and handles
retrying if the provided function raises a URLGrabError.
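The retry approach described in the last item above might look roughly like the following sketch. The `Grabber` class, the `fetch` argument, and the stand-in `URLGrabError` are assumptions made for illustration; the real URLGrabber code is not reproduced here and may differ.

```python
class URLGrabError(IOError):
    """Stand-in for the library's error type (assumption for this sketch)."""


class Grabber:
    """Hypothetical, simplified grabber; not the real URLGrabber class."""

    def __init__(self, retry=3):
        self.retry = retry

    def _retry(self, func, *args):
        # Call func up to self.retry times; re-raise the last URLGrabError.
        tries = 0
        while True:
            tries += 1
            try:
                return func(*args)
            except URLGrabError:
                if tries >= self.retry:
                    raise

    def urlread(self, url, fetch):
        # The public method wraps its work in a local function
        # and hands that function to _retry().
        def retryfunc(url):
            return fetch(url)
        return self._retry(retryfunc, url)


attempts = []

def flaky_fetch(url):
    # Fails on the first two calls, then succeeds.
    attempts.append(url)
    if len(attempts) < 3:
        raise URLGrabError("temporary failure")
    return b"payload"

result = Grabber(retry=3).urlread("http://example.invalid/file", flaky_fetch)
```

Keeping the loop in one place means each urlXXX-style method only has to describe a single attempt.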
2003-12-09 Tuesday rtomayko
* grabber.py: default_grabber is now a default instance of
URLGrabber used by the module level urlXXX functions.
* grabber.py: deprecated module level set_<option> methods.
default_grabber should be used instead.
* grabber.py: URLGrabber should be completely stateless now.
Using a single URLGrabber object for multiple requests
shouldn't be a problem.
* grabber.py: Added URLGrabberOptions class to handle kwargs
more elegantly.
2003-12-08 Monday rtomayko
* __init__.py, setup.py: everything should be in a package named
urlgrabber now (distutils is a nice piece of software).
* grabber.py, range.py, rangetests.py, grabbertests.py: Initial
attempt at range support for HTTP and local files through range
module.
* byterange.py: brought in skeleton classes from urllib2 for FTP
range support but FTP ranges are not yet supported.
* byterange.py: Local file range support internally implemented
with RangeableFileObject which is a pretty cool file object
wrapper that automatically handles ranging.
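As a rough idea of what a ranging file object wrapper does, here is a minimal sketch. The class name `RangedReader` is hypothetical and this is not the RangeableFileObject implementation; it assumes a seekable underlying file object and a `(first, last)` tuple with a non-inclusive end.

```python
import io


class RangedReader:
    """Hypothetical wrapper exposing only bytes [first, last) of a file object."""

    def __init__(self, fo, rangetup):
        self.fo = fo
        self.first, self.last = rangetup
        self.fo.seek(self.first)      # assumes the file object is seekable
        self.pos = self.first

    def read(self, size=-1):
        # Never read past the end of the requested range.
        remaining = self.last - self.pos
        if size < 0 or size > remaining:
            size = remaining
        data = self.fo.read(size)
        self.pos += len(data)
        return data


reader = RangedReader(io.BytesIO(b"abcdefghij"), (2, 6))
chunk = reader.read(3)   # first three bytes of the range
rest = reader.read()     # whatever is left inside the range
```

Callers read from the wrapper as they would from any file object, and the range bookkeeping stays out of their way.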
2003-12-07 Sunday rtomayko
* grabber.py: minor changes for line wrapping
* grabber.py: added URLGrabber class (moved _do_open, _do_grab,
_parse_url, _get_raw_throttle into said class).
* grabber.py: deprecated retrygrab. now supported through
urlgrab(retry=n).
* grabber.py: local filenames (i.e. without URL scheme
specifier) are now handled wherever a url may be passed.
* grabber.py: retry is now fully supported on urlread and
partially supported on urlopen. wrt urlopen, retries are
possible if a URLGrabError occurs while _opening_ the url
(e.g. because it is temporarily unavailable). however, we
still can't protect anyone from problems that occur after
urlopen has returned the file object.
2003-12-05 Friday rtomayko
* grabber.py: Initial attempt at retooling the interface to use
kwargs.. *Most* functionality should still be working..
2003-10-13 Michael D. Stenner <mstenner@linux.duke.edu>
* urlgrabber.py, test/test_urlgrabber.py:
created test_urlgrabber.py and started moving tests into it
2003-10-12 Michael D. Stenner <mstenner@linux.duke.edu>
* README, makefile, progress_meter.py, urlgrabber.py,
urlgrabber/keepalive.py, LICENSE, MANIFEST.in, setup.py:
Initial revision
2003-10-12 Michael D. Stenner <mstenner@linux.duke.edu>
* README, makefile, progress_meter.py, urlgrabber.py,
urlgrabber/keepalive.py, LICENSE, MANIFEST.in, setup.py:
importing urlgrabber