# TODO list

## Release v0.6

1. Review encoder and check for lzma improvements under xz.
2. Fix binary tree matcher.
3. Compare compression ratio with xz tool using comparable parameters
   and optimize parameters.
4. Do some optimizations
   - rename operation action and make it a simple type of size 8
   - make maxMatches, wordSize parameters
   - stop searching after a certain length is found (parameter sweetLen)

## Release v0.7

1. Optimize code
2. Do statistical analysis to get linear presets.
3. Test sync.Pool compatibility for xz and lzma Writer and Reader (see the
   sketch below).
4. Fuzz optimized code.
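
A minimal sketch of what such a sync.Pool test could exercise. This is not the
package's API design, just an illustration: it pools the output buffers around
xz.NewWriter. Pooling the Writer itself would additionally require a Reset
method, which is exactly the kind of compatibility this item is about.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"

	"github.com/ulikunitz/xz"
)

// bufPool recycles output buffers between compressions.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// compress writes an xz stream for data into a pooled buffer and
// returns a copy of the compressed bytes.
func compress(data []byte) ([]byte, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	w, err := xz.NewWriter(buf)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	// Copy the result, because buf goes back into the pool.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out, nil
}

func main() {
	out, err := compress([]byte("hello, xz"))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out), "compressed bytes")
}
```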

## Release v0.8

1. Support parallel goroutines for writing and reading xz files.
2. Support a ReaderAt interface for xz files with small block sizes.
3. Improve compatibility between gxz and xz.
4. Provide a manual page for gxz.

## Release v0.9

1. Improve documentation
2. Fuzz again

## Release v1.0

1. Fully functional gxz
2. Add godoc URL to README.md (godoc.org)
3. Resolve all issues.
4. Define release candidates.
5. Public announcement.

## Package lzma

### Release v0.6

- Rewrite Encoder into a simple greedy one-op-at-a-time encoder
  including
  + simple scan at the dictionary head for the same byte
  + use the killer byte (requiring matches to get longer, the first
    test should be the byte that would make the match longer); see the
    sketch below
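
A minimal sketch of the killer-byte test from the last bullet, with invented
names; it only illustrates rejecting a candidate by the single byte that would
have to match for the candidate to beat the current best match.

```go
package sketch

// longestMatch is an illustrative match search, not the package's real
// code. candidates are positions earlier than pos in buf.
func longestMatch(buf []byte, pos int, candidates []int) (bestPos, bestLen int) {
	for _, c := range candidates {
		// Killer-byte test: the byte that would make this candidate
		// longer than the current best match must match; otherwise the
		// candidate cannot improve on bestLen and the full comparison
		// is skipped.
		if pos+bestLen >= len(buf) || buf[c+bestLen] != buf[pos+bestLen] {
			continue
		}
		if n := matchLen(buf, c, pos); n > bestLen {
			bestPos, bestLen = c, n
		}
	}
	return bestPos, bestLen
}

// matchLen counts how many bytes starting at a and b are equal.
func matchLen(buf []byte, a, b int) int {
	n := 0
	for b+n < len(buf) && buf[a+n] == buf[b+n] {
		n++
	}
	return n
}
```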

## Optimizations

- There may be a lot of false sharing in lzma.State; check whether this
  can be improved by reorganizing its internal structure (see the sketch
  after this list).
- Check whether batching encoding and decoding improves speed.
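
The false-sharing point can be illustrated with cache-line padding. This is a
generic sketch of the technique with invented field names, not the actual
layout of lzma.State; whether it helps in practice is what the item asks to
check.

```go
package sketch

// hotCounters separates fields written by different goroutines with
// padding so that they land on different 64-byte cache lines.
type hotCounters struct {
	encoded uint64   // written frequently by one goroutine
	_       [56]byte // pad out the rest of the cache line
	decoded uint64   // written frequently by another goroutine
	_       [56]byte
}
```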

### DAG optimizations

- Use full buffer to create minimal bit-length above range encoder.
- Might be too slow (see v0.4)

### Different match finders

- hashes with 2, 3 characters in addition to 4 characters
- binary trees with 2-7 characters (uint64 as key, use uint32 as
  pointers into an array)
- rb-trees with 2-7 characters (uint64 as key, use uint32 as pointers
  into an array with bit-stealing for the colors); see the sketch below
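
A small sketch of the bit-stealing idea from the last bullet, with invented
names: a node reference is a uint32 index into an array, and the red/black
color is stored in the least significant bit, so no extra color field is
needed.

```go
package sketch

// nodeRef packs an array index and a red/black color into one uint32.
type nodeRef uint32

const (
	red   = 0
	black = 1
)

// makeRef stores the color in the low bit and the index in the
// remaining 31 bits.
func makeRef(index, color uint32) nodeRef {
	return nodeRef(index<<1 | color&1)
}

func (r nodeRef) index() uint32 { return uint32(r) >> 1 }
func (r nodeRef) color() uint32 { return uint32(r) & 1 }
```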

## Release Procedure

- execute goch -l for all packages; probably with a lower param like 0.5.
- check orthography with gospell
- Write release notes in doc/relnotes.
- Update README.md
- xb copyright . in xz directory to ensure all new files have the
  Copyright header
- VERSION=<version> go generate github.com/ulikunitz/xz/... to update
  the version files
- Execute test for Linux/amd64, Linux/x86 and Windows/amd64.
- Update TODO.md - write a short log entry
- git checkout master && git merge dev
- git tag -a <version>
- git push

## Log

### 2017-06-05

Release v0.5.4 fixes issue #15, another problem with the padding size
check for the xz block header. I removed the check completely.

### 2017-02-15

Release v0.5.3 fixes issue #12 regarding the decompression of an empty
XZ stream. Many thanks to Tomasz Kłak, who reported the issue.

### 2016-12-02

Release v0.5.2 became necessary to allow the decoding of xz files with
4-byte padding in the block header. Many thanks to Greg, who reported
the issue.

### 2016-07-23

Release v0.5.1 became necessary to fix problems with 32-bit platforms.
Many thanks to Bruno Brigas, who reported the issue.

### 2016-07-04

Release v0.5 provides improvements to the compressor and support for
the decompression of xz files with multiple xz streams.

### 2016-01-31

Another compression rate increase by checking the byte at the length of
the best match first, before checking the whole prefix. This makes the
compressor even faster. We now have a large time budget to beat the
compression ratio of the xz tool: for enwik8 we have over 40 seconds to
reduce the compressed file size by another 7 MiB.

### 2016-01-30

I simplified the encoder. Speed and compression rate increased
dramatically. A high compression rate also affects the decompression
speed. The approach with the buffer and optimizing for operation
compression rate has not been successful. Going for the maximum length
appears to be the best approach.

### 2016-01-28

The release v0.4 is ready. It provides a working xz implementation,
which is rather slow, but works and is interoperable with the xz tool.
It is an important milestone.

### 2016-01-10

I have the first working implementation of an xz reader and writer. I'm
happy about reaching this milestone.

### 2015-12-02

I'm now ready to implement xz because I have a working LZMA2
implementation. I decided today that v0.4 will use the slow encoder
using the operations buffer to be able to go back, if I intend to do so.

### 2015-10-21

I have restarted the work on the library. While trying to implement
LZMA2, I discovered that I need to resimplify the encoder and decoder
functions. The option approach is too complicated. Using a limited byte
writer, not caring about the written bytes at all, and not trying to
handle uncompressed data simplifies the LZMA encoder and decoder a lot.
Processing uncompressed data and handling limits is a feature of the
LZMA2 format, not of LZMA.

I learned an interesting method from the LZO format: if the last copy is
too far away, they move the head by 2 bytes instead of 1 byte to reduce
processing time.
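
A tiny sketch of that heuristic, with an invented threshold and function name:
when the previous copy pointed far back, the scan head advances by two bytes
instead of one.

```go
package sketch

// farDistance is an arbitrary illustrative threshold.
const farDistance = 1 << 16

// nextPos advances the scanning position; if the last copy operation
// pointed far back, it skips an extra byte to cut search time.
func nextPos(pos, lastCopyDistance int) int {
	if lastCopyDistance > farDistance {
		return pos + 2
	}
	return pos + 1
}
```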

### 2015-08-26

I have now reimplemented the lzma package. The code is reasonably fast,
but can still be optimized. The next step is to implement LZMA2 and then
xz.

### 2015-07-05

Created release v0.3. The version is the foundation for a full xz
implementation that is the target of v0.4.

### 2015-06-11

The gflag package has been developed because I couldn't use flag and
pflag for fully compatible support of gzip's and lzma's options. It
seems to work quite nicely now.

### 2015-06-05

The overflow issue was interesting to research. Henry S. Warren Jr.'s
book Hacker's Delight was very helpful as usual and explained the issue
perfectly. Fefe's information on his website was based on the C FAQ and
quite bad, because it didn't address the issue of -MININT == MININT.
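
For reference, the corner case in Go terms: negating the most negative value
of a fixed-width signed type wraps back to the same value, so a naive
absolute-value helper fails for exactly this input.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	x := int32(math.MinInt32)
	fmt.Println(-x == x)   // true: -MinInt32 overflows back to MinInt32
	fmt.Println(-int64(x)) // 2147483648: widening before negating avoids it
}
```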

### 2015-06-04

It has been a productive day. I improved the interface of lzma.Reader
and lzma.Writer and fixed the error handling.

### 2015-06-01

By computing the bit length of the LZMA operations I was able to
improve the greedy algorithm implementation. With an 8 MByte buffer the
compression rate was not as good as for xz, but already better than the
gzip default.

Compression is currently slow, but this is something we will be able to
improve over time.

### 2015-05-26

Checked the license of ogier/pflag. The lzmago binary should include the
license terms for the pflag library.

I added the endorsement clause as used by Google for the Go sources to
the LICENSE file.

### 2015-05-22

The package lzb now contains the basic implementation for creating or
reading LZMA byte streams. It allows for the implementation of the
DAG-shortest-path algorithm for the compression function.

### 2015-04-23

Yesterday I completed the lzbase classes. I'm a little bit concerned
that using the components may require too much code, but on the other
hand there is a lot of flexibility.

### 2015-04-22

Implemented Reader and Writer during the Bayern game against Porto. The
second half gave me enough time.

### 2015-04-21

While showering this morning I discovered that the design for OpEncoder
and OpDecoder doesn't work, because encoding/decoding might depend on
the current status of the dictionary. This is not exactly the right way
to start the day.

Therefore we need to keep the Reader and Writer design. This time around
we simplify it by ignoring size limits. These can be added by wrappers
around the Reader and Writer interfaces. The Parameters type isn't
needed anymore.

However, I will implement ReaderState and WriterState types to use
static typing to ensure that the right State object is combined with the
right lzbase.Reader and lzbase.Writer.

As a start I have implemented ReaderState and WriterState to ensure
that the state for reading is only used by readers and WriterState is
only used by writers.
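
A minimal sketch of the static-typing idea with invented declarations (not the
real lzbase code): distinct state types make it a compile-time error to hand a
writer's state to a reader.

```go
package sketch

// state holds whatever is shared between reading and writing
// (probability models, dictionary position, ...).
type state struct{}

// ReaderState and WriterState are distinct types, so they cannot be
// mixed up even though they wrap the same underlying data.
type ReaderState struct{ s state }
type WriterState struct{ s state }

type Reader struct{ st *ReaderState }
type Writer struct{ st *WriterState }

// NewReader accepts only a *ReaderState; passing a *WriterState does
// not compile.
func NewReader(st *ReaderState) *Reader { return &Reader{st: st} }

// NewWriter accepts only a *WriterState.
func NewWriter(st *WriterState) *Writer { return &Writer{st: st} }
```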

### 2015-04-20

Today I implemented the OpDecoder and tested OpEncoder and OpDecoder.

### 2015-04-08

Came up with a new simplified design for lzbase. I have already
implemented the type State that replaces OpCodec.

### 2015-04-06

The new lzma package is now fully usable and lzmago uses it. The old
lzma package has been completely removed.

### 2015-04-05

Implemented lzma.Reader and tested it.

### 2015-04-04

Implemented baseReader by adapting code from lzma.Reader.

### 2015-04-03

The opCodec was copied to lzma2 yesterday. opCodec has a high number of
dependencies on other files in lzma2. Therefore I had to copy almost all
files from lzma.

### 2015-03-31

Removed only a TODO item.

However, in Francesco Campoy's presentation "Go for Javaneros
(Javaïstes?)" there is the idea that, with an embedded field E in a type
T, all the methods of E are defined on T. If E is an interface, T
satisfies E.

https://talks.golang.org/2014/go4java.slide#51

I have never used this, but it seems to be a cool idea.
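
A quick self-contained example of that slide's point (names invented here):
embedding an interface E in a struct T promotes E's methods to T, so T itself
satisfies E.

```go
package main

import "fmt"

type Greeter interface {
	Greet() string
}

type english struct{}

func (english) Greet() string { return "hello" }

// Loud embeds the Greeter interface; the promoted Greet method means
// Loud satisfies Greeter as well.
type Loud struct {
	Greeter
}

func main() {
	var g Greeter = Loud{Greeter: english{}}
	fmt.Println(g.Greet()) // "hello", via the promoted method
}
```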

### 2015-03-30

Finished the type writerDict and wrote a simple test.

### 2015-03-25

I started to implement the writerDict.

### 2015-03-24

After thinking long about the LZMA2 code and several false starts, I
now have a plan to create a self-sufficient lzma2 package that supports
the classic LZMA format as well as LZMA2. The core idea is to support a
baseReader and baseWriter type that support the basic LZMA stream
without any headers. Both types must support the reuse of dictionaries
and the opCodec.

### 2015-01-10

1. Implemented simple lzmago tool
2. Tested tool against large 4.4G file
   - compression worked correctly; tested decompression with lzma
   - decompression hits a full buffer condition
3. Fixed a bug in the compressor and wrote a test for it
4. Executed full cycle for 4.4 GB file; performance can be improved ;-)

### 2015-01-11

- Release v0.2 because of the working LZMA encoder and decoder