Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
50 most recent check-ins
2020-10-21
| ||
15:06 | Merged bug fixes and improvements from branch bwt_fixes. Leaf check-in: 201f429acb user: thomas tags: trunk | |
15:00 | Marked scripts in bin/ as executable. (Fixed issue caused by importing from bzr.) check-in: 7ec9fbe091 user: thomas tags: trunk | |
14:52 | Ignore __pycache__ directories. check-in: 1c6e79d7dc user: thomas tags: trunk | |
14:51 | Ported brzignore file to fossil. check-in: 116c9fabd8 user: thomas tags: trunk | |
2019-03-27
| ||
12:55 | Code cleanup: - Marked CL kernels as private attributes - Renamed RadixSort class to SegmentedRadixSort, because it implements a segmented sorting algorithm. - Defined __all__ in burrows_wheeler.py to mark BurrowsWheelerEncoder as the only public definition. Closed-Leaf check-in: c78469149c user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-03-21
| ||
14:22 | Fixed style issues (PEP8 weak warnings) and typos in comments. check-in: a001ecd589 user: thomas.hess@udo.edu tags: bwt_fixes | |
13:54 | Optimization: Do not resolve the input block three times inside the update_segment_boundary_kernel. The kernel now takes the sorted and resolved input_block from the RadixSort implementation instead. This reduces the kernel parameter count, streamlines the algorithm data flow and reduces the random data reads inside the kernel. check-in: 6f9c9c5452 user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-03-14
| ||
23:33 | Rewritten the BWT to use a segmented RadixSort implementation. The new sorting implementation allows to perform a segmented sorting operation on the whole input block using one RadixSort kernel enqueue. Removed the host-side iteration over to-be sorted segments. The RadixSort honors the segments and sorts all segments in one sorting operation. Removed the now obsoleted IndexOf implementation. Removed the now obsolete InsertionSort implementations. Removed the bwt_sorter.py module and related tests. The RadixSort class is moved to burrows_wheeler.py The new implementation further reduces the kernel enqueue count and speeds up the BWT execution. check-in: 3f6a503268 user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-03-13
| ||
21:48 | Optimized the RadixSort implementation: Dropped the input block resolution, because the resolved input block is now available in the outer algorithm. Instead of duplicating the work, re-use the already available data. check-in: 947f2a7146 user: thomas.hess@udo.edu tags: bwt_fixes | |
20:44 | Optimization: Replaced the InsertionSort kernel with a SegmentInsertionSort CL kernel. The new kernel implements a segmented sort using the previously written InsertionSort algorithm. This cuts the kernel call count down, from > 30000 kernel calls on a 100000k input array to ~300 kernel calls. Removed now broken tests, that tested the old and obsoleted InsertionSort interface. check-in: eb80130698 user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-03-05
| ||
13:19 | Added the BurrowsWheelerEncoder to the benchmark script. check-in: fe9d05f2b9 user: thomas.hess@udo.edu tags: bwt_fixes | |
13:19 | Fixed typo in RLE1 docstring. check-in: 592a9a6d7b user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-03-04
| ||
20:24 | Performance optimization: Create one InsertionSort kernel on instantiation. Previously, a new kernel was created for each sort operation. Now, only one kernel is created and re-used, which cuts the execution time in half. check-in: 3f3be7baff user: thomas.hess@udo.edu tags: bwt_fixes | |
16:39 | Fixed an issue with the RadixSort implementation. The data preparation now correctly resolves the input block indirection, without corrupting the index array. Removed some debug logging and did some minor style improvements. Added some more comments. check-in: ae62d72761 user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-03-02
| ||
22:37 | Fixed the issue in the RadixSort implementation. All available BWT related tests pass. check-in: 790c778655 user: thomas.hess@udo.edu tags: bwt_fixes | |
11:29 | Changed the CopyIf algorithm to an IndexOf algorithm: The previous copy_if implementation was crafted to mimic an index_of operation, by feeding an array containing it’s own indices. (arr[i] == i) The change: - Reduces memory overhead of about 3.5MiB by not requiring an arange array with len(array) > 900_000 - Simplified the code - Lifted length limitation of the BWT algorithm. It can now potentially transform any input array of size <= 2GiB. check-in: 48470c947c user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-02-28
| ||
13:20 | Added temporary logging to the RadixSort implementation. check-in: 6e4498c4a8 user: thomas.hess@udo.edu tags: bwt_fixes | |
11:40 | Added another RadixSort test case check-in: b983aa99d8 user: thomas.hess@udo.edu tags: bwt_fixes | |
11:14 | RadixSort: Use an additional CL kernel to resolve the input indirection prior to sorting. This corrects the sorting behaviour and should speed up the sorting. Improved the BWT test output in case of failures. check-in: c6239d1cdd user: thomas.hess@udo.edu tags: bwt_fixes | |
10:50 | Updated logging, marked untouched CL kernel parameter as const. check-in: 811173a30d user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-02-26
| ||
13:19 | Added further tests for the sorting algorithms. check-in: ed3c4171c8 user: thomas.hess@udo.edu tags: bwt_fixes | |
2019-02-25
| ||
23:31 | Prevent PyOpenCL from hogging all system memory. Implemented a custom copy_if function that does not use the library supplied function, which hogs massive amount of RAM due to aggressive kernel caching of internally generated CL kernels. check-in: c42ffe6feb user: thomas.hess@udo.edu tags: bwt_fixes | |
19:28 | Added test cases that isolate the issue with the RadixSort implementation. check-in: cd8f77e2ea user: thomas.hess@udo.edu tags: bwt_fixes | |
18:40 | Added GPLv3 header to test modules. Moved duplicate test function definition into a common module. check-in: af6254931c user: thomas.hess@udo.edu tags: bwt_fixes | |
18:38 | Removed the outdated and unneeded tests/context.py module. check-in: c90cde022c user: thomas.hess@udo.edu tags: bwt_fixes | |
18:38 | Added GPLv3 header to all encoder modules. check-in: 8f182e85df user: thomas.hess@udo.edu tags: bwt_fixes | |
18:09 | Use Python sets to pass the segment boundaries. Store each set for one iteration to determine the algorithm progress. check-in: f0545c191f user: thomas.hess@udo.edu tags: bwt_fixes | |
18:02 | Added test cases that enforce the usage of the broken RadixSort implementation. check-in: 0e7897250f user: thomas.hess@udo.edu tags: bwt_fixes | |
17:54 | Split BWT test function into multiple functions, each implementing a stage during the test run. This will ease adding further tests. check-in: 174d850f91 user: thomas.hess@udo.edu tags: bwt_fixes | |
14:36 | BWT sorting algorithms: Reduced function parameter count. check-in: 610c00aeef user: thomas.hess@udo.edu tags: trunk | |
14:34 | InsertionSort: Switch to signed array indices. This fixes a compiler warning emitted on Intel and AMD platforms. It prevents crashes in certain cases. check-in: 28f2f9b457 user: thomas.hess@udo.edu tags: trunk | |
14:32 | Tests: Tell PyOpenCL to emit OCL compiler warnings as Python warnings. check-in: 58459e0a02 user: thomas.hess@udo.edu tags: trunk | |
2019-02-22
| ||
01:26 | BWT: Improved data flow. Fixed typo in a variable name. Download data earlier, if the data is only used on the host later. Updated the Logger in bwt_sorter.py. check-in: 6f805874fa user: thomas.hess@udo.edu tags: trunk | |
00:57 | Cleared up the BurrowsWheelerEncoder class: - Removed much of the logging - Use an appropriate sorting algorithm for different segment sizes. - Removed unused variables, cleaned up the code style. check-in: 61b1efb471 user: thomas.hess@udo.edu tags: trunk | |
00:56 | BWT: Added two test cases, testing equal character inputs. check-in: 0214ee175c user: thomas.hess@udo.edu tags: trunk | |
00:55 | InsertionSort: Added a warning that triggers if the algorithm gets used on large input. check-in: c8fa3ce2f5 user: thomas.hess@udo.edu tags: trunk | |
00:28 | Refactored the sorting algorithms into a common module. Both the RadixSort and InsertionSort have a separate class, sharing much of the Python logic code using a common base class. check-in: d64e24385a user: thomas.hess@udo.edu tags: trunk | |
2019-02-21
| ||
22:19 | Implemented a first working BWT version. It needs optimizing, removal of excessive logging and synchronisation. The current implementation uses the serial InsertionSort algorithm only. check-in: 32945542b5 user: thomas.hess@udo.edu tags: trunk | |
21:40 | BWT tests: Removed an invalid test case. The test data assumed a different sort order, thus the expected data differed. Removed a test of a private method. check-in: b3cd3e8b66 user: thomas.hess@udo.edu tags: trunk | |
11:21 | Added logging messages to the Bz2ClBase and InputUploader classes. check-in: 54967a6eb5 user: thomas.hess@udo.edu tags: trunk | |
2019-02-18
| ||
11:28 | RunLengthEncoder1: Added logging output to the RLE encoder. check-in: 902f5a46bc user: thomas.hess@udo.edu tags: trunk | |
11:28 | BWT: improved test code. check-in: 685769835c user: thomas.hess@udo.edu tags: trunk | |
10:47 | Added some logging messages to the InputUploader class. check-in: 7a1b2ae57e user: thomas.hess@udo.edu tags: trunk | |
10:47 | Fixed a bug in the MemCopy CL kernel cache. Kernel lookup now works as expected. check-in: 16763b9fd4 user: thomas.hess@udo.edu tags: trunk | |
10:46 | InputUploader: Add tests for the new block size validation. check-in: 09d110f426 user: thomas.hess@udo.edu tags: trunk | |
10:45 | Added a logger framework. check-in: 7397a70545 user: thomas.hess@udo.edu tags: trunk | |
10:12 | InputUploader: Extended block_size validations. Extended documentation. check-in: 9994aa036f user: thomas.hess@udo.edu tags: trunk | |
2019-02-14
| ||
19:09 | InputUploader: Extended documentation check-in: adacb7753b user: thomas.hess@udo.edu tags: trunk | |
14:34 | RLE1: Improved code documentation. check-in: c36c9ef777 user: thomas.hess@udo.edu tags: trunk | |
13:40 | Tests: Added test data sanity checks for the BWT test data. This should catch invalid test data, that is invalid due to simple typos. check-in: db859f076e user: thomas.hess@udo.edu tags: trunk | |