commit d1fe6fd1c0930d88da90f23f6d5fdb6ceaf6b0a9 [revision 2829] Author: Henrik Gramner Date: Mon May 15 00:18:36 2017 +0200 x86: Only enable AVX-512 in 8-bit mode commit 6151882671b6f9e1ceec2cdb76dd1123c8dc766f [revision 2828] Author: Henrik Gramner Date: Fri May 12 00:43:43 2017 +0200 x86: AVX-512 cabac_block_residual commit 4579616543f2e701ee9510f5eb57e31a3ef99e10 [revision 2827] Author: Henrik Gramner Date: Wed May 10 18:36:59 2017 +0200 x86: AVX-512 pixel_sad_x3 and pixel_sad_x4 Covers all variants: 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, and 16x16. commit 993eb2079e45619098241e14806fc70030968af6 [revision 2826] Author: Henrik Gramner Date: Sun May 7 23:35:49 2017 +0200 x86: AVX-512 pixel_sad Covers all variants: 4x4, 4x8, 4x16, 8x4, 8x8, 8x16, 16x8, and 16x16. commit 2463174c119cef4f7e6a36a1151054fbb268b082 [revision 2825] Author: Henrik Gramner Date: Thu May 4 21:53:28 2017 +0200 x86: AVX-512 decimate_score Also drop the MMX versions and improve the SSE2, SSSE3 and AVX2 versions. commit 49fb50a67cc41e4bed2dd66f7beed12797249cd9 [revision 2824] Author: Henrik Gramner Date: Mon May 1 14:55:45 2017 +0200 x86: AVX-512 pixel_var2_8x8 and 8x16 commit 92c074e27f6bfccee42b41c183203b7b2763a94d [revision 2823] Author: Henrik Gramner Date: Mon May 1 14:54:32 2017 +0200 Rework pixel_var2 The functions are only ever called with pointers to fenc and fdec and the strides are always constant so there's no point in having them as parameters. Cover both the U and V planes in a single function call. This is more efficient with SIMD, especially with the wider vectors provided by AVX2 and AVX-512, even when accounting for losing the possibility of early termination. Drop the MMX and XOP implementations, update the rest of the x86 assembly to match the new behavior. Also enable high bit-depth in the AVX2 version. Comment out the ARM, AARCH64, and MIPS MSA assembly for now. commit 4c48f9e751e969188d606eb15aeada7f652c9db9 [revision 2822] Author: Henrik Gramner Date: Sat Apr 29 14:26:40 2017 +0200 x86: AVX-512 pixel_var_8x8, 8x16, and 16x16 Make the SSE2, AVX, and AVX2 versions a bit faster. Drop the MMX and XOP versions. commit 1cf7baa462ca52de7f07d6e4c795853900bb50bb [revision 2821] Author: Henrik Gramner Date: Fri Apr 28 21:35:25 2017 +0200 x86: AVX-512 pixel_sa8d_8x8 commit 386050088a66aa66bcaebb9b6b4b0a2b6af76a73 [revision 2820] Author: Henrik Gramner Date: Thu Apr 13 23:56:04 2017 +0200 x86: AVX-512 pixel_satd Covers all variants: 4x4, 4x8, 4x16, 8x4, 8x8, 8x16, 16x8, and 16x16. commit 2eceefe89fea91bbc7d5af2a1b4a9047d8da7805 [revision 2819] Author: Henrik Gramner Date: Wed Apr 19 16:39:48 2017 +0200 x86: AVX-512 deblock_strength Also drop the MMX version and make some slight improvements to the SSE2, SSSE3, AVX, and AVX2 versions. commit 3081ffa1c540d1df05123e0fab1937985573ac78 [revision 2818] Author: Henrik Gramner Date: Wed Apr 12 16:21:09 2017 +0200 x86: AVX-512 plane_copy_deinterleave_v210 commit 95dc64c4efdf16404e58be9ff9da4e0acaa1a4b2 [revision 2817] Author: Henrik Gramner Date: Sun Apr 9 20:34:28 2017 +0200 x86: AVX-512 memzero_aligned Reorder some elements in the x264_t.mb.pic struct to reduce the amount of padding required. Also drop the MMX implementation in favor of SSE. commit c0cd7650cb65164d183d8f77d0697b7569a52917 [revision 2816] Author: Henrik Gramner Date: Fri Apr 7 21:34:40 2017 +0200 x86: AVX and AVX-512 memcpy_aligned Reorder some elements in the x264_mb_analysis_list_t struct to reduce the amount of padding required. Also drop the MMX implementation in favor of SSE. commit f29fbc6fd23e9bc2d800eb1246e8fa19a203b831 [revision 2815] Author: Henrik Gramner Date: Thu Apr 6 16:06:34 2017 +0200 x86: AVX-512 dequant_8x8_flat16 commit 40aca29a164d5e5e6589d507bdcae6717d72f6bf [revision 2814] Author: Henrik Gramner Date: Tue Apr 4 20:54:12 2017 +0200 x86: AVX-512 dequant_8x8 commit 74f7802bb7bd301299f8229a0552a7caf2b55434 [revision 2813] Author: Henrik Gramner Date: Tue Apr 4 20:01:26 2017 +0200 x86: AVX-512 dequant_4x4 commit 3451ba3af49e58a720277615df3d8e4a4171986f [revision 2812] Author: Henrik Gramner Date: Tue Mar 28 22:59:56 2017 +0200 x86: AVX-512 mbtree_propagate_cost Also make the AVX and AVX2 implementations slightly faster. commit 75f6f9b228c3498b8c9b0d97fc925c0a7e6e6f43 [revision 2811] Author: Henrik Gramner Date: Mon Mar 27 18:19:53 2017 +0200 x86: AVX-512 coeff_last commit c3a1d1d892a79bc460c7fc192b0bf7a32c2ce0b2 [revision 2810] Author: Henrik Gramner Date: Sun Mar 26 18:29:37 2017 +0200 x86: AVX-512 zigzag_interleave_8x8_cavlc commit edb22f57ba03718c1cb9781ba005aec20a1e50e0 [revision 2809] Author: Henrik Gramner Date: Sun Mar 26 11:34:18 2017 +0200 x86: AVX-512 zigzag_scan_8x8_field commit 77b9a818fc622d0cdaa96aeb37339fbd5b1ef857 [revision 2808] Author: Henrik Gramner Date: Sat Mar 25 22:13:22 2017 +0100 x86: AVX-512 zigzag_scan_4x4_field commit 724a577237f27cdb0c0fd18ef8ed32d39430796b [revision 2807] Author: Henrik Gramner Date: Sat Mar 25 19:14:28 2017 +0100 x86: AVX-512 zigzag_scan_8x8_frame The vperm* instructions ignores unused bits, so we can pack the permutation indices together to save cache and just use a shift to get the right values. commit 2b2f039512bde7c097280255c6376cf9a901e08e [revision 2806] Author: Henrik Gramner Date: Sat Mar 25 19:14:22 2017 +0100 x86: AVX-512 zigzag_scan_4x4_frame commit 1878c7f2af0a9c73e291488209109782c428cfcf [revision 2805] Author: Henrik Gramner Date: Fri May 12 00:03:10 2017 +0200 checkasm: x86: More accurate ymm/zmm measurements YMM and ZMM registers on x86 are turned off to save power when they haven't been used for some period of time. When they are used there will be a "warmup" period during which performance will be reduced and inconsistent which is problematic when trying to benchmark individual functions. Periodically issue "dummy" instructions that uses those registers to prevent them from being powered down. The end result is more consitent benchmark results. commit 472ce3648aea3ddc16b7716eb114f4bcdb8fea8f [revision 2804] Author: Henrik Gramner Date: Sat Mar 25 10:16:09 2017 +0100 x86: AVX-512 support AVX-512 consists of a plethora of different extensions, but in order to keep things a bit more manageable we group together the following extensions under a single baseline cpu flag which should cover SKL-X and future CPUs: * AVX-512 Foundation (F) * AVX-512 Conflict Detection Instructions (CD) * AVX-512 Byte and Word Instructions (BW) * AVX-512 Doubleword and Quadword Instructions (DQ) * AVX-512 Vector Length Extensions (VL) On x86-64 AVX-512 provides 16 additional vector registers, prefer using those over existing ones since it allows us to avoid using `vzeroupper` unless more than 16 vector registers are required. They also happen to be volatile on Windows which means that we don't need to save and restore existing xmm register contents unless more than 22 vector registers are required. Also take the opportunity to drop X264_CPU_CMOV and X264_CPU_SLOW_CTZ while we're breaking API by messing with the cpu flags since they weren't really used for anything. Big thanks to Intel for their support. commit d2b5f4873e2147452a723b61b14f030b2ee760a5 [revision 2803] Author: Henrik Gramner Date: Sat Mar 18 18:50:36 2017 +0100 x86: Change assembler from yasm to nasm This is required to support AVX-512. Drop `-Worphan-labels` from ASFLAGS since it's enabled by default in nasm. Also change alignmode from `k8` to `p6` since it's more similar to `amdnop` in yasm, e.g. use long nops without excessive prefixes. commit 8c2974255b01728d4eda2434cc1997c4a3ca5eff [revision 2802] Author: Henrik Gramner Date: Sat May 6 12:26:56 2017 +0200 x86: Add some additional cpuflag relations Simplifies writing assembly code that depends on available instructions. LZCNT implies SSE2 BMI1 implies AVX+LZCNT AVX2 implies BMI2 Skip printing LZCNT under CPU capabilities when BMI1 or BMI2 is available, and don't print FMA4 when FMA3 is available. commit 93bc2cbc66f0bf4616965dcd7e0eba89201c8086 [revision 2801] Author: Henrik Gramner Date: Fri Apr 14 16:16:49 2017 +0200 x86: Faster SSE2 pixel_sad_16x16 and 16x8 Also make the order of fenc/fdec arguments a bit more consistent. commit 8ae2b62462176cd731a1cb8b5bdc9a38cba0fbe4 [revision 2800] Author: Anton Mitrofanov Date: Mon May 15 00:40:52 2017 +0300 msvs/icl: Improve target host detection commit 181a920ad5d0acdc3a08418c0e9c95be4785b814 [revision 2799] Author: Alexandra Hájková Date: Sat May 13 17:14:52 2017 +0000 ppc: Optimize add8x8_idct_dc Increases speedup compared to C from 2x to 6x. commit d0b905b901c5ee5989777cf437a7f20c1fa0a794 [revision 2798] Author: Henrik Gramner Date: Sun Feb 19 10:33:16 2017 +0100 analyse: Faster min/max MV clipping Values only needs to be clipped in one direction. commit 1bde30193eb91d1bc69b00a27e6874eb88ed4eab [revision 2797] Author: Henrik Gramner Date: Thu Feb 16 20:04:10 2017 +0100 slicetype_mb_cost: Clip MVs based on MV range Improves cost calculations, especially when a short MV range is used. commit dcf406978b9dda5c2b8aab80af5c1c47c78efd92 [revision 2796] Author: Henrik Gramner Date: Sun Jan 29 21:38:43 2017 +0100 Support YUYV and UYVY packed 4:2:2 raw input Packed YUV is arguably more common than planar YUV when dealing with raw 4:2:2 content. We can utilize the existing plane_copy_deinterleave() functions with some additional minor constraints (we cannot assume any particular alignment or overread the input buffer). Enables assembly optimizations on x86. commit aaa9aa83a111ed6f1db253d5afa91c5fc844583f [revision 2795] Author: Henrik Gramner Date: Thu Apr 20 21:58:23 2017 +0200 x86: Utilize 3-arg instructions in AVX deblock Avoids some redundant register-register moves. commit a52d41c4d135c79373a86c3a82dcc2ec3f88b025 [revision 2794] Author: Martin Storsjö Date: Fri Mar 24 11:33:46 2017 +0200 configure: Support targeting ARM with MSVC tools Set up the right gas-preprocessor as assembler frontend in these cases, using armasm as actual assembler. Don't try to add the -mcpu -mfpu options in this case. Check whether the compiler actually supports inline assembly. Check for the ARMv7 features in a different way for the MSVC compiler. commit b22a5db3c481b10b4a6ec190978d97b377750a12 [revision 2793] Author: Martin Storsjö Date: Fri Mar 24 11:33:45 2017 +0200 configure: Check for -lshell32 before forcibly adding it into LDFLAGSCLI When targeting the Windows Phone API subset, there is no shell32.lib. When targeting Windows Phone/RT, the CLI itself won't be built, but LDFLAGSCLI are included in all later cases of cc_check within configure. Therefore only add -lshell32 there if it actually is usable. commit 0aed59e74808f1cd22ee47c055a8eb4f367b2f55 [revision 2792] Author: Martin Storsjö Date: Thu May 4 22:00:51 2017 +0300 arm: Always unconditionally declare .arch armv7-a We already unconditionally declare .fpu neon and try to build all the neon codepaths (but only execute them conditionally based on a runtime check). This fixes builds targeting armv6, where the rbit instruction isn't available. This instruction is only used within a neon function in any case, so there's little point in emulating it. commit 196d7676c8f40b7c1f8f2f4af64e09ebf4c9816b [revision 2791] Author: Martin Storsjö Date: Fri Mar 24 11:33:44 2017 +0200 arm: Use .section .rodata for non-elf, non-mach platforms as well If targeting windows with armasm, gas-preprocessor can rewrite the .section .rodata into the right construct for that platform. commit 9bffbabfecf0bda066362a1b76b62c5085257e18 [revision 2790] Author: Martin Storsjö Date: Fri Mar 24 11:33:41 2017 +0200 gas-preprocessor: Support conversion of additional arm instructions into thumb Convert muls into mul+cmp. Convert "and r0, sp, #xx" into "mov r0, sp", "and r0, r0, #xx". Convert ldr with a too large shift into add+ldr. This only works in the special case when the base register is the same as the target for the ldr. commit 2e9bd88f27ed8f5f058e7e220070b7a15965cb8e [revision 2789] Author: Martin Storsjö Date: Fri Mar 24 11:33:40 2017 +0200 arm: Explicitly declare using the .text segment in the function macro This fixes one issue in building with MS armasm via gas-preprocessor. Without the .text segment specification, the object files assembled fine, but linking failed. (armasm source files don't get the text/code segment implied automatically if nothing is specified.) commit 64843af913e76fd7fb590e9227f678add96e8a3c [revision 2788] Author: Martin Storsjö Date: Fri Mar 24 11:33:39 2017 +0200 osdep: Use the EXPAND macro on other cases of ALIGNED_ARRAY_EMU EXPAND is already used on the other cases where ALIGNED_ARRAY_EMU is used on all platforms (originally needed for ICL, later also required by MSVC); apply the same change (originally from 21ba91ae) for the cases that only are used on ARM. This fixes use of ALIGNED_ARRAY_16 with MSVC when targeting ARM. commit 757091fe3abd0af0f45d11f52b652f0be2fb76f5 [revision 2787] Author: Martin Storsjö Date: Fri Mar 24 11:33:38 2017 +0200 Update to the latest version of gas-preprocessor.pl From http://git.libav.org/?p=gas-preprocessor.git This update contains changes from myself only. commit d13705191cdcbcd10d87524dbb0c26ba998d8dcc [revision 2786] Author: Martin Storsjö Date: Fri Mar 24 11:33:37 2017 +0200 arm: Skip using gas-preprocessor for iOS on arm as well The few constructs that differ can easily be handled within the source itself - tested to be working since at least Xcode 6. commit 3a3cfe32416efa4f966c0586411148236e4703c1 [revision 2785] Author: Martin Storsjö Date: Fri Mar 24 11:33:36 2017 +0200 arm: Use const macros in arm assembly where applicable This unifies the source code style, and allows building the code with clang without gas-preprocessor. commit 1e92821c5a52c80ca4d1a9b6d038bec84be48b0a [revision 2784] Author: Martin Storsjö Date: Fri Mar 24 11:33:35 2017 +0200 arm: Use commas between all macro arguments in arm assembly The clang built-in assembler requires proper commas between all macro arguments. As long as gas-preprocessor is used when building with clang, this isn't an issue. commit a84e6a486b991bffb2cc9f86b6e236978d251d2c [revision 2783] Author: Martin Storsjö Date: Fri Mar 24 11:33:34 2017 +0200 aarch64: Skip invoking gas-preprocessor for iOS Clang can handle all the constructs used there these days, working since Xcode 6 at least. commit 535fd2ec9985b9874d6ed23904404d0d2f5d40d6 [revision 2782] Author: Martin Storsjö Date: Fri Mar 24 11:33:33 2017 +0200 aarch64: Use the const macro in the aarch64 checkasm assembly source This fixes building the source with clang for iOS without gas-preprocessor. commit bec87ba69421572282e473cf8f2e11c77285ed88 [revision 2781] Author: Henrik Gramner Date: Wed Apr 12 23:26:32 2017 +0200 Windows: Add support for MSVC compilation with WSL In Windows 10 version 1703 (Creators Update) WSL supports calling native Windows binaries from the Bash shell, but it requires using full file names including extension, e.g. `cl.exe` instead of `cl`. We also don't have access to `cygpath`, so use a simple regex for converting the dependencies to Unix paths that `make` can understand. commit 43e9a6157752c2a3c2cc6c6a7fa13da72033d1dd [revision 2780] Author: Henrik Gramner Date: Sun Jan 29 22:58:24 2017 +0100 cli: Improve the --fullhelp raw demuxer input-csp listing Use the same logic for indentation as the lavf demuxer. commit 3538df12688fc4408f585c4e65ee92d5a4737b2c [revision 2779] Author: Anton Mitrofanov Date: Sat May 20 21:17:59 2017 +0300 x86inc: Remove argument from WIN64_RESTORE_XMM The use of rsp was pretty much hardcoded there and probably didn't work otherwise with stack_size > 0. commit e4b0974a4ea3a727f6cc8941e9accf7ef3ba0637 [revision 2778] Author: Henrik Gramner Date: Sat Apr 22 20:30:35 2017 +0200 x86inc: Prefer r14/r15 over r12/r13 on x86-64 Due to a peculiarity in the ModR/M addressing encoding, the r12 and r13 registers sometimes requires an additional byte when used as a base register. r14 and r15 doesn't have that issue, so prefer using them. commit 46a489b5e21cae3b4fea5d41cc285dcaf79d19e3 [revision 2777] Author: Henrik Gramner Date: Thu Apr 20 19:16:51 2017 +0200 x86inc: Make REP_RET identical to RET in SSSE3+ functions There's no point in emitting a rep prefix before ret on modern CPUs. commit 50a9dd78263191474c948d53e837348abd0bf316 [revision 2776] Author: Henrik Gramner Date: Wed Mar 29 16:43:57 2017 +0200 x86inc: Fix call with memory operands We overload the `call` instruction with a macro, but it would misbehave when the macro argument wasn't a valid identifier. Fix it by explicitly checking if the argument is an identifier. commit d13b4c3a9574cd2fbd5407c7dfc58eeff72d2080 [revision 2775] Author: Henrik Gramner Date: Sun Jan 29 16:41:33 2017 +0100 osdep: Rework alignment macros Drop ALIGNED_N and ALIGNED_ARRAY_N in favor of using explicit alignment. This will allow us to increase the native alignment without unnecessarily increasing the alignment of everything that's currently 32-byte aligned. commit 5840e200a0f1869a0596c5ed75c76f4d3221dd68 [revision 2774] Author: Vittorio Giovara Date: Mon Jan 30 22:14:57 2017 +0100 Move cabac_block_residual function declarations commit a2d2621cc5741414b1f1adfbc08f19f1cc763847 [revision 2773] Author: Vittorio Giovara Date: Mon Jan 30 22:14:59 2017 +0100 Recursively delete conftest files On OS X, one of the conftest files might be a directory named `conftest.dSYM`. commit 988ce459433fd3f978d632e8fc0ef9c19c94a6a1 [revision 2772] Author: Vittorio Giovara Date: Mon Jan 30 22:14:56 2017 +0100 Drop unused function declarations commit fb3f97833cbe3305eb613633e604f424d6d2d096 [revision 2771] Author: Vittorio Giovara Date: Fri Jan 27 18:06:39 2017 +0100 x86: Adjust cache64_ssse3 function suffixes Makes those function names more consistent with other similar functions. commit a77f3917cc6ba5e1d3c20ca649d4114217976d53 [revision 2770] Author: Vittorio Giovara Date: Fri Jan 27 16:21:16 2017 +0100 mc: Mark a function only used within the file as static commit 0ca36bfa3d2bf272da88b1df5abfc0406662989a [revision 2769] Author: Vittorio Giovara Date: Fri Jan 27 16:21:15 2017 +0100 ppc: Drop two unused static functions commit d32d7bf1c6923a42cbd5ac2fd540ecbb009ba681 [revision 2768] Author: Henrik Gramner Date: Fri May 19 16:08:34 2017 +0200 cli: Verify that yuv/y4m input has at least one frame of data Prevents a SIGBUS crash caused by attempting to access a memory-mapped region beyond the end of the input file. commit 959e869c20ea151917695930d9ad0a7a9a2f90c5 [revision 2767] Author: Kaustubh Raste Date: Fri Apr 14 15:29:31 2017 +0530 mips: Fix out-of-tree build Signed-off-by: Kaustubh Raste commit d6eb2c9630d40a2765d5092f87637f4e4d084ed1 [revision 2766] Author: Henrik Gramner Date: Sat Mar 25 00:02:11 2017 +0100 checkasm: Fix load_deinterleave_chroma_fdec test The function only writes to parts of the destination buffer but the test verifies the content of the entire buffer. The problem is that some earlier IDCT functions clobbers the same part of the buffer with garbage when benchmarked which would incorrectly cause test failures. Fix this by explicitly zeroing the buffers beforehand. commit a472b60daae0cac17d91ddf62ad4f474ded63e5b [revision 2765] Author: Henrik Gramner Date: Fri Mar 24 22:27:42 2017 +0100 checkasm: Fix compilation on hardened x86-64 ELF systems Normal PC-relative relocations cannot be used for resolving the address of external symbols on systems where ASLR results in the offset being larger than 32 bits. We are required to to go through the PLT instead. commit 469ad705b1064207b6b1068d1e25a0a591021007 [revision 2764] Author: Martin Storsjö Date: Thu Mar 23 15:05:38 2017 +0200 aarch64: Fix building checkasm for iOS On iOS, symbols are prefixed - this prefix gets added by the X() macro. commit 93340ca300e7ce66f49e41b7c2ef4a0492a7e57c [revision 2763] Author: Martin Storsjö Date: Thu Mar 23 15:05:37 2017 +0200 configure: Always enable PIC in aarch64 assembly for apple platforms This is similar to what we do for 32-bit ARM assembly as well. Fixes linker errors such as `ld: Absolute addressing not allowed in arm64 code but used in '_x264_cabac_encode_terminal_asm' referencing '_x264_cabac_range_lps' for architecture arm64`. commit 90a61ec76424778c050524f682a33f115024be96 [revision 2762] Author: Alexandra Hájková Date: Mon Dec 5 10:28:53 2016 +0000 ppc: AltiVec plane_copy_deinterleave commit bd6b66dbf9fcf67b7ebb23e4e9249083191fb984 [revision 2761] Author: Alexandra Hájková Date: Mon Jan 2 12:56:48 2017 +0000 ppc: AltiVec plane_copy_deinterleave_v210 commit 00f1670087db1b025a8088289de8938bf88a0d8b [revision 2760] Author: Alexandra Hájková Date: Wed Dec 7 19:48:02 2016 +0000 ppc: AltiVec plane_copy_deinterleave_rgb Also add some missing vector types in ppccommon.h commit 5e1ed367d725f895eeadf358861ab52521a420d3 [revision 2759] Author: Vittorio Giovara Date: Thu Jan 19 17:43:57 2017 +0100 ppc: Adjust AltiVec function suffix Architecture should always be the last element. commit 28ebb95d92278069b80ee729eb1884fe0981c6ae [revision 2758] Author: Vittorio Giovara Date: Mon Jan 9 22:28:20 2017 +0100 Move the x264_mdate() declaration to the appropriate header commit 1d2420981aa004f051a0869c005776084f7d2a44 [revision 2757] Author: Vittorio Giovara Date: Tue Jan 17 17:04:19 2017 +0100 arm/aarch64: Correctly prefix integral function symbols commit 4c4c495d58dbdea46a23947e4f202fc3b82fb891 [revision 2756] Author: Anton Mitrofanov Date: Fri Jan 13 14:57:51 2017 +0100 x86: Avoid using hardcoded function symbol prefixes commit 2524fc3164d9f00b393d4254d2c5ea8f3b9d43b0 [revision 2755] Author: Henrik Gramner Date: Wed Jan 18 21:57:14 2017 +0100 x86: AVX2 high bit-depth load_deinterleave_chroma load_deinterleave_chroma_fenc: 50% faster than AVX load_deinterleave_chroma_fdec: 25% faster than AVX commit cce50082129d3c92bd41bc0afc5a8c8d93084c9c [revision 2754] Author: Henrik Gramner Date: Wed Jan 18 21:46:55 2017 +0100 x86: AVX2 load_deinterleave_chroma_fenc 20% faster than SSSE3. commit c22c10ddb21e9f5af1da83d37122e6f7388e1342 [revision 2753] Author: Henrik Gramner Date: Tue Jan 17 21:59:47 2017 +0100 x86: AVX2 plane_copy_deinterleave 50% faster than SSSE3 in 8-bit. 25% faster than AVX in high bit-depth. Also drop the MMX versions of deinterleave functions in favor of SSE2. commit f4890275ca6523dfe5b4ae60279ae8597d9dbd4b [revision 2752] Author: Henrik Gramner Date: Thu Jan 12 22:16:53 2017 +0100 x86: AVX2 plane_copy_deinterleave_rgb Around 15% faster than SSSE3. commit da71b556730c8eb6c12a0d6950a221a4e4a99ca6 [revision 2751] Author: Henrik Gramner Date: Thu Jan 12 21:36:28 2017 +0100 x86: Faster plane_copy_deinterleave_rgb_sse2 50% faster than the previous SSE2 function. commit 3c7bf52c5b0a849458a45b5628ed1cc4b898da5f [revision 2750] Author: Henrik Gramner Date: Sun Jan 15 14:52:29 2017 +0100 x86util: Reduce code size of high bit-depth AVX LOAD_DIFF AVX supports unaligned memory operands which makes the SATD code a bit denser. commit c7a2e327bebd2b863c2620b6962fa18ab681e5dd [revision 2749] Author: Henrik Gramner Date: Sun Jan 1 19:10:10 2017 +0100 Bump dates to 2017 commit 97eaef2ab82a46d13ea5e00270712d6475fbe42b [revision 2748] Author: Alexandra Hájková Date: Sat Jan 21 12:34:49 2017 +0000 ppc: Fix the pre-VSX vec_vsx_st() fallback macro It would previously only work correctly with 8-bit data types. Fixes compilation with --disable-vsx. commit 2ebe09a4f583d108c6ec1caf70b2a7a289a8820d [revision 2747] Author: Alexandra Hájková Date: Wed Jan 18 09:13:39 2017 +0000 Fix plane_copy_deinterleave_v210 on big-endian commit 79288d90471e246584d19054bdb5381982114126 [revision 2746] Author: Alexandra Hájková Date: Wed Dec 21 13:13:43 2016 +0000 ppc: Avoid instantiating unused plane_copy functions Those functions are currently only used in 8-bit mode and results in warnings in other bit depths. commit 2ebdb90bd32c3d1618b1c5b360bff750b82b1d0b [revision 2745] Author: Martin Storsjö Date: Tue Dec 27 00:22:48 2016 +0200 arm: Load mb_y properly in mbtree_propagate_list_internal_neon The previous version, attempting to load two stack parameters at once, only would have worked if they were interpreted and loaded as 32 bit elements, not when loading them as 16 bit elements. commit b97ae0644f16bad2e2c9c9181264a946769a0aa0 [revision 2744] Author: Anton Mitrofanov Date: Mon Oct 31 14:39:52 2016 +0300 analyse: Fix lambda table values commit b2b39dae0bd891c8d150b4f4c3a2a24d8d6c1431 [revision 2743] Author: Anton Mitrofanov Date: Sat Nov 26 15:30:58 2016 +0300 Cosmetics Also make x264_weighted_reference_duplicate() static. commit 9c82d2b65534e477c972b811a4dd5004d0dd262e [revision 2742] Author: Alexandra Hájková Date: Mon Nov 28 14:04:10 2016 +0000 ppc: AltiVec store_interleave_chroma commit ea1fee272b20e1bcff2a862ea9a29e151c9136a9 [revision 2741] Author: Alexandra Hájková Date: Mon Nov 28 10:51:54 2016 +0000 ppc: AltiVec plane_copy_interleave commit 42348a8e664b091203a05d3e15555b5085afcac1 [revision 2740] Author: Alexandra Hájková Date: Sat Nov 26 20:03:34 2016 +0000 ppc: AltiVec plane_copy_swap commit 2610019af8bfb8e71f813cd2188b9eccbc287c59 [revision 2739] Author: Alexandra Hájková Date: Wed Nov 23 20:53:51 2016 +0100 ppc: AltiVec zigzag_interleave_8x8_cavlc commit 25e4e06fe8151f627a953fbd2bd39302436bf689 [revision 2738] Author: Alexandra Hájková Date: Wed Nov 23 20:53:50 2016 +0100 ppc: AltiVec zigzag_scan_8x8_frame commit 99863c665a6d4ec58b7fcc4a8a791e9c8f35a86e [revision 2737] Author: Alexandra Hájková Date: Mon Nov 14 15:06:06 2016 +0100 ppc: AltiVec sub8x8_dct_dc commit 42cb0a6813714b5380e23871a155e3820846d991 [revision 2736] Author: Alexandra Hájková Date: Mon Nov 14 15:06:05 2016 +0100 ppc: AltiVec add8x8_idct_dc commit 983acc911543453449a65bd02bbdff4c8cfe8e6a [revision 2735] Author: Martin Storsjö Date: Wed Nov 16 10:57:31 2016 +0200 checkasm: aarch64: Add filler args to make sure all parameters are passed on the stack This, combined with clobbering the stack space prior to the call, increases the chances of finding cases where 32 bit parameters are erroneously treated as 64 bit. commit 8ada354c9b5d72356c34c9ae3f787a6df4d61506 [revision 2734] Author: Martin Storsjö Date: Wed Nov 16 10:57:30 2016 +0200 checkasm: aarch64: Clobber the stack before calling functions commit 62d604ac6dddbf553c1ff2432d899b61cc50d95a [revision 2733] Author: Alexandra Hájková Date: Tue Nov 1 23:16:17 2016 +0100 ppc: Use vec_vsx_ld instead of VEC_LOAD/STORE macros Remove VEC_LOAD*, some of VEC_STORE* macros, some PREP* macros and VEC_DIFF_H_OFFSET macro. Make sure the functions do not use deprected primitives. commit 16142d8ee2a974060ecbad0f495b5a5c6516a75e [revision 2732] Author: Luca Barbato Date: Tue Nov 1 23:16:16 2016 +0100 ppc: Provide fallbacks for older architectures commit 2b741f81e51f92d053d87a49f59ff1026553a0f6 [revision 2731] Author: Luca Barbato Date: Tue Nov 1 23:16:14 2016 +0100 ppc: Add VSX support to configure commit 1f7518182e3204cb14e87baffb0150a848167ddc [revision 2730] Author: Luca Barbato Date: Tue Nov 1 23:16:13 2016 +0100 ppc: Manually unroll the horizontal prediction loop Doubles the speedup from the function (from being slower to be over twice as fast than C). commit 0706ddb1df88d716cf73decba4d82b953011760c [revision 2729] Author: Henrik Gramner Date: Sat Oct 8 17:20:18 2016 +0200 x86inc: Avoid using eax/rax for storing the stack pointer When allocating stack space with an alignment requirement that is larger than the current stack alignment we need to store a copy of the original stack pointer in order to be able to restore it later. If we chose to use another register for this purpose we should not pick eax/rax since it can be overwritten as a return value. commit 4d5c8b01a48f72f9c40651e92c39294326a0863f [revision 2728] Author: Henrik Gramner Date: Thu Dec 1 16:05:16 2016 +0100 Show the correct settings for --preset slow in --fullhelp The slow preset was recently adjusted but we forgot to update the corresponding --fullhelp message to reflect the change. commit c996ed202e2d17d1d8ae42c42d0707e51c29bb93 [revision 2727] Author: Martin Storsjö Date: Mon Nov 14 23:54:51 2016 +0200 checkasm: arm/aarch64: Fix the amount of space reserved for stack parameters Even if MAX_ARGS - 2 (for arm) or MAX_ARGS - 6 (for aarch64) parameters are passed on the stack to checkasm_checked_call, we actually only need to store MAX_ARGS - 4 (for arm) or MAX_ARGS - 8 (for aarch64) parameters on the stack when calling the tested function. commit cd15b354a887943d525e6fd8096ad4b75692d2b2 [revision 2726] Author: Janne Grunau Date: Mon Nov 14 23:54:50 2016 +0200 checkasm: arm: preserve the stack alignment in x264_checkasm_checked_call The stack used by x264_checkasm_checked_call_neon was a multiple of 4 when the checked function is called. AAPCS requires a double word (8 byte) aligned stack public interfaces. Since both calls are public interfaces the stack is misaligned when the checked is called. This can cause issues if code called within this (which includes the C implementations) relies on the stack alignment. commit 834e1b11e174f2694a4c81b4922c0c5f8778796a [revision 2725] Author: Martin Storsjö Date: Wed Nov 16 10:56:14 2016 +0200 arm: Don't use vcmp.f64 for testing for an all-zeros register On iOS, vcmp.f64 can behave as if the register was zero, if the register (interpreted as a f64), was a denormal number. The vcmp.f64 (and other VFP instructions) will trap to the kernel (which is supposed to implement the FP operation, which it apparently doesn't do properly on iOS) if the value is a denormal. If this happens, the whole comparison ends up way more costly. commit a91e95fca2222ac0731e987a07f4b11c670f4556 [revision 2724] Author: Janne Grunau Date: Wed Nov 16 10:49:14 2016 +0200 aarch64: Clear the upper half of int parameters in x264_plane_copy_core_neon commit 1eab3b402e1d7729da295024fa7eec8b09e30c20 [revision 2723] Author: Luca Barbato Date: Tue Nov 1 23:16:18 2016 +0100 ppc: Fix hadamard for little-endian Extending to 16-bit works with flipped bytes. commit 75918e1849e1286885bfcfb0c348de885a702fb3 [revision 2722] Author: Anton Mitrofanov Date: Thu Sep 22 00:17:48 2016 +0300 Correctly signal max_dec_frame_buffering with --keyint 1 According to E.2.1 it is inferred to be equal to 0 only if profile_idc is equal to 44, 86, 100, 110, 122, or 244 and constraint_set3_flag is equal to 1. commit 72d53ab2ac7af24597a824e868f2ef363a22f5d4 [revision 2721] Author: Henrik Gramner Date: Sat Sep 17 21:41:52 2016 +0200 x86: Faster pixel_ssim_4x4x2_core commit 8c07263ad9218bdc3e0f5b84d578968513885df7 [revision 2720] Author: Henrik Gramner Date: Sat Sep 17 21:14:35 2016 +0200 x86: Deduplicate a constant in hpel_filter_c commit 9521b278adb92081f052c1b7bfc4b95651d88b07 [revision 2719] Author: Henrik Gramner Date: Sat Sep 17 14:45:08 2016 +0200 x86: Faster pixel_ssd_nv12 Also drop the MMX2 version to simplify things. commit 75d0f9cc8770bc4f36785062116757d24eb44604 [revision 2718] Author: Henrik Gramner Date: Sun Sep 11 15:32:54 2016 +0200 x86: SSE zigzag_scan_4x4_field Replaces the MMX2 version, one cycle faster. Also change the checkasm test to use the correct alignment macro. commit 0ce77f9eb71051c9a6121ec12c2abaac99ee628a [revision 2717] Author: Henrik Gramner Date: Wed Sep 7 19:27:31 2016 +0200 x86: AVX2 mbtree_propagate_list SIMD part is around 25% faster than AVX on Haswell, around 7% faster when including the runtime of the scalar C wrapper. commit 0c36239a4826f6e5a3cb873aca1814e389a46e29 [revision 2716] Author: Henrik Gramner Date: Wed Sep 7 19:26:42 2016 +0200 x86: Move predict_16x16_dc_left calculations to asm 1-2 cycles faster and avoids some code duplication to decrease code size. Also drop the MMX2 implementation in favor of SSE2 to simplify things. commit 0cc8afd31212de013b26b10f58c608c9adcff2fc [revision 2715] Author: Anton Mitrofanov Date: Thu Aug 18 19:00:48 2016 +0300 avs: support for AviSynth+ high bit-depth pixel formats commit dc0fe73636d34baeb3a64918b52db64d2a9e83bb [revision 2714] Author: Janne Grunau Date: Fri Aug 26 20:26:56 2016 +0300 aarch64: implement x264_plane_copy_swap_neon plane_copy_swap_c: 27054 plane_copy_swap_neon: 4152 commit eaf2fc20c8579714a48523b7ab8c05373708a25f [revision 2713] Author: Anton Mitrofanov Date: Thu Aug 18 22:14:22 2016 +0300 Various cosmetics of semicolon use commit aae177c55141460f442de0572c4a434bf2ae20bc [revision 2712] Author: Henrik Gramner Date: Thu Jul 28 21:58:40 2016 +0200 cli: Prefetch yuv/y4m input frames on Windows 8 and newer Use PrefetchVirtualMemory() (if available) on memory-mapped input frames. Significantly improves performance when the source file is not already present in the OS page cache by asking the OS to bring in those pages from disk using large, concurrent I/O requests. Most beneficial on fast encoding settings. Up to 40% faster overall with --preset ultrafast, and up to 20% faster overall with --preset veryfast. This API was introduced in Windows 8, so call it conditionally. On older Windows systems the previous behavior remains unchanged. commit 4e5adb87070c82b937c03e0cc030eae3578c251d [revision 2711] Author: Henrik Gramner Date: Thu Jul 28 19:34:04 2016 +0200 Adjust --preset slow * Swap --me umh for --trellis 2. They have a similar effect on performance but the latter gives slightly better results in most cases. * Change --b-adapt from 2 to 1. Negligible difference in quality since the b-adapt 1 improvements, but it's significantly faster. Also remove a redundant assignment from veryfast (--me hex is set by default). commit 1e4fb55a283ba90fef346033027af851f2a04468 [revision 2710] Author: Henrik Gramner Date: Thu Jul 28 19:33:57 2016 +0200 ratecontrol_new: Simplify an expression in HRD timescale calculation Also gets rid of a false positive static analyser integer division warning. commit 17378b2028146fa54a1b2b90da62554935d9dcc2 [revision 2709] Author: Henrik Gramner Date: Thu Jul 28 19:33:44 2016 +0200 gcc: Enable __sync_fetch_and_add() on x86-64 It was previously only enabled on 32-bit x86 for no reason, so 64-bit systems had to use a mutex instead of a simple `lock xadd` instruction. Note that this code is only used in some very specific configurations involving sliced threads. commit 86b71982e131eaa70125f8d0e725fcade9c4c677 [revision 2708] Author: Anton Mitrofanov Date: Tue Sep 20 18:48:22 2016 +0300 mips: Fix high bit-depth compilation commit 1ea3c682ca12c7f13ea6f82b42bdc40afcfda87f [revision 2707] Author: Henrik Gramner Date: Sat Sep 17 15:53:59 2016 +0200 checkasm: Fix compilation on Windows with --disable-thread commit 5caef139cf7d6b41a95ee9568625d36d1ae1c107 [revision 2706] Author: Janne Grunau Date: Fri Aug 26 20:26:55 2016 +0300 arm/aarch64: use plane_copy wrapper macros Move the macros to common/mc.h to share them across all architectures. Fixes possible buffer overreads if the width of the user supplied frames is not a multiple of 16. Reported-by: Kirill Batuzov commit 3f5ed56d4105f68c01b86f94f41bb9bbefa3433b [revision 2705] Author: Henrik Gramner Date: Sun Apr 3 17:28:33 2016 +0200 configure: Support specifying a custom pkg-config commit 7c9c687d8062f72b3ec300de8997bdae8277a741 [revision 2704] Author: Anton Mitrofanov Date: Wed Jun 8 22:46:17 2016 +0300 Add support for new VUI parameters Support the new color primaries, transfer characteristics, and matrix coefficients defined in the 2016-02 edition of the H.264 specification. commit 92515e8ff73491ef8a44c85e0bee265ba5791070 [revision 2703] Author: Henrik Gramner Date: Sun Apr 24 14:10:22 2016 +0200 configure: Add link-time optimization support Enabled by using the --enable-lto configuration option. May give a slight performance improvement in some cases, but it can also reduce performance in other cases (largely compiler-dependant) so don't enable it by default. It also makes compilation (and linking in particular) a fair bit slower. Note that some older versions of GNU binutils will incorrectly warn about "memset used with constant zero length parameter" when linking using LTO. This is due to a bug in binutils and can safely be ignored. commit b6267e0ff770545de88dfb5d3f176ea73f453730 [revision 2702] Author: Henrik Gramner Date: Sun Apr 24 13:32:43 2016 +0200 configure: Fix clang detection with versioned binaries Correctly detect clang binaries that has the version number appended as a suffix to the file name, e.g. `clang38`. commit 14a58532fea2c5f9e7b93c918476d842091c4268 [revision 2701] Author: Janne Grunau Date: Sun Apr 24 14:38:56 2016 +0200 arm: Add asm for mbtree fixed point conversion 7-8 times faster on a cortex-a53 vs. gcc-5.3. mbtree_fix8_pack_c: 44114 mbtree_fix8_pack_neon: 5805 mbtree_fix8_unpack_c: 38924 mbtree_fix8_unpack_neon: 4870 commit b6f189eb4c5646483f7901293944695167e71ed9 [revision 2700] Author: Janne Grunau Date: Sun Apr 24 14:38:55 2016 +0200 aarch64: Add asm for mbtree fixed point conversion pack is ~7 times faster and unpack is ~9 times faster on a cortex-a53 compared to gcc-5.3. mbtree_fix8_pack_c: 41534 mbtree_fix8_pack_neon: 5766 mbtree_fix8_unpack_c: 44102 mbtree_fix8_unpack_neon: 4868 commit a5e06b9a435852f0125de4ecb198ad47340483fa [revision 2699] Author: Anton Mitrofanov Date: Sun May 22 22:33:58 2016 +0300 Fix p4x4 analyse for 4:4:4 encoding with chroma ME commit 07221290db0a94bda1f6ece3fdf3c02675c8adce [revision 2698] Author: Anton Mitrofanov Date: Sun May 22 22:18:34 2016 +0300 Fix 4:4:4 encoding with CQM commit 23ebc1f763936b7fcfc81e21530e1b65dbc503b9 [revision 2697] Author: Anton Mitrofanov Date: Sun May 22 19:36:05 2016 +0300 Fix p4x4 RDO with CAVLC commit 740a8c556bd9b68e899d6991f3f987a443aa14aa [revision 2696] Author: Anton Mitrofanov Date: Sat Apr 23 23:10:03 2016 +0300 Apply zone options a little bit earlier This way things like SAR changes will have full effect from the start frame. commit 928bd9d5def4f0ca5071ea176a11b816a01e6495 [revision 2695] Author: Anton Mitrofanov Date: Sat Apr 23 22:45:44 2016 +0300 Fix corruption when using encoder_reconfig() with some parameters Changing parameters that affects SPS, like --ref for example, wasn't behaving correctly previously. Probably a regression in r2373. commit 3b70645597bea052d2398005bc723212aeea6875 [revision 2694] Author: Anton Mitrofanov Date: Wed Apr 13 21:54:25 2016 +0300 Clean up header includes commit 2102de2584e03fce4abac49eb37d5d7a0803380f [revision 2693] Author: Henrik Gramner Date: Wed Apr 13 17:53:49 2016 +0200 Eliminate some compiler warnings on BSD Include in addition to . According to the POSIX specification the prototypes for strcasecmp() and strncasecmp() are declared in . On some systems they are also declared in for compatibility reasons but we shouldn't rely on that. Define _POSIX_C_SOURCE only when it's required to do so. Some BSD variants doesn't declare certain function prototypes otherwise. commit 64f4e24909924fceeea6e154d71b7dfbf586c7ea [revision 2692] Author: Henrik Gramner Date: Tue Apr 12 21:33:54 2016 +0200 osx: Add -D_DARWIN_C_SOURCE to CFLAGS OSX doesn't like _POSIX_C_SOURCE being defined when _DARWIN_C_SOURCE isn't. commit 00597d74c6223f3694e2c6614ef0574d7fca6b22 [revision 2691] Author: Anton Mitrofanov Date: Tue Apr 12 20:33:42 2016 +0300 Remove an unused parameter from x264_slicetype_frame_cost() The b_intra_penalty parameter is no longer used anywhere after the improvements to the --b-adapt 1 algorithm. commit aa26e880bc2cd04cc81c776051d5e21d03fc975a [revision 2690] Author: Anton Mitrofanov Date: Sun Apr 10 20:17:32 2016 +0300 Improve the --b-adapt 1 algorithm Roughly the same speed as before but with significantly better results, comparable to --b-adapt 2. commit 24f25b6afd21488a93bd86098f98dfaf229fc149 [revision 2689] Author: Henrik Gramner Date: Sun Apr 3 15:49:26 2016 +0200 analyse: i_sub_partition write combining commit 1507cfe80ecf5f8e240a35e9e9dc5a92bd25e792 [revision 2688] Author: Henrik Gramner Date: Tue Mar 15 20:16:45 2016 +0100 x86: Use one less register in mbtree_propagate_cost_avx2 Avoids the need to save and restore xmm6 on 64-bit Windows. commit c82c7374938f4342971adf8b2495c3a1bbe621c4 [revision 2687] Author: Henrik Gramner Date: Fri Mar 4 17:53:08 2016 +0100 x86: Add asm for mbtree fixed point conversion The QP offsets of each macroblock are stored as floats internally and converted to big-endian Q8.8 fixed point numbers when written to the 2-pass stats file, and converted back to floats when read from the stats file. Add SSSE3 and AVX2 implementations for conversions in both directions. About 8x faster than C on Haswell. commit be677efc6313ade5eddf722fdf097cce56df1344 [revision 2686] Author: Anton Mitrofanov Date: Thu Apr 7 13:09:03 2016 +0300 x86inc: Enable AVX emulation in additional cases Allows emulation to work when dst is equal to src2 as long as the instruction is commutative, e.g. `addps m0, m1, m0`. commit b5661d322866df647e6084061a471eceac214c28 [revision 2685] Author: Anton Mitrofanov Date: Thu Apr 7 12:48:29 2016 +0300 x86inc: Improve handling of %ifid with multi-token parameters The yasm/nasm preprocessor only checks the first token, which means that parameters such as `dword [rax]` are treated as identifiers, which is generally not what we want. commit 283663d4c13088f4811c78b75318bda59d696b2d [revision 2684] Author: Anton Mitrofanov Date: Mon Mar 28 18:35:38 2016 +0300 x86inc: Fix AVX emulation of some instructions commit 54fd697668d0a04246ad0b0e9955a6583b2bb8b6 [revision 2683] Author: Henrik Gramner Date: Fri Mar 4 17:51:41 2016 +0100 x86inc: Fix AVX emulation of scalar float instructions Those instructions are not commutative since they only change the first element in the vector and leave the rest unmodified. commit eeb9b66ddb0f27d8baaa8efa9597613e61140836 [revision 2682] Author: Henrik Gramner Date: Sat Feb 27 20:34:39 2016 +0100 x86: dct2x4dc asm Only used in 4:2:2. MMX2 version implemented for 8-bit, SSE2 and AVX versions implemented for high bit-depth. 2.5x faster on 32-bit and 1.6x faster on 64-bit compared to C on Ivy Bridge. commit 23d1d8e89be2d99f5c6924a6055fc80d69429503 [revision 2681] Author: Henrik Gramner Date: Sat Feb 20 20:31:22 2016 +0100 x86: SSE2/AVX idct_dequant_2x4_(dc|dconly) Only used in 4:2:2. Both 8-bit and high bit-depth implemented. Approximate performance improvement compared to C on Ivy Bridge: x86-32 x86-64 idct_dequant_2x4_dc 2.1x 1.7x idct_dequant_2x4_dconly 2.7x 2.0x Helps more on 32-bit due to the C versions being register starved. commit dbbf1dd2836a21b65178442c1fb7a00ea089d7ec [revision 2680] Author: Henrik Gramner Date: Sat Feb 20 16:53:35 2016 +0100 checkasm: Fix idct_dequant_2x4_(dc|dconly) tests They used the wrong qp values and the dconly test had the wrong name. This was undetected before because there wasn't any assembly implementations. commit 0db0ac3a05b80eee7994fab08cbce2d07e8b1586 [revision 2679] Author: Henrik Gramner Date: Sun Feb 7 14:55:26 2016 +0100 checkasm: Disable Windows Error Reporting When developing new assembly code it's expected that checkasm may crash, and the error reporting dialog popup can be somewhat annoying. commit deae1b1001d134f5babc4fad3208bd951a454951 [revision 2678] Author: Henrik Gramner Date: Sat Feb 6 18:49:46 2016 +0100 windows: Flag debug builds in the resource file commit 0082b717199bafb4abbb6638e7c30d50deaf2c1b [revision 2677] Author: Henrik Gramner Date: Thu Feb 4 20:06:57 2016 +0100 cli: Refactor filter option parsing The old code contained a whole bunch of memory leaks, unchecked mallocs, sections of dead code, etc. and was generally overly complex. Also consolidate some memory allocations into a single one. commit dfe394cadc8a39752de5b3f4a0be222c1b9290f2 [revision 2676] Author: Henrik Gramner Date: Sun Jan 31 21:50:52 2016 +0100 ffms: Various improvements * Drop the MinGW Unicode workarounds. Those were required at the time Windows Unicode support was added to x264 but the underlying problem has since been fixed in FFMS. * Use FFMS_IndexBelongsToFile() as an additional sanity check when reading an index file to ensure that it belongs to the current source video. * Upgrade to the new API to prevent deprecation warnings when compiling. * Fix a resource leak that would occur if FFMS_GetFirstTrackOfType() or FFMS_CreateVideoSource() failed. * Minor string handling adjustments related to progress reporting. This increases the FFMS version requirement from 2.16.2 to 2.21.0. commit 215afdbd8ecc924f2028f79851458076683e97ad [revision 2675] Author: Henrik Gramner Date: Mon Apr 11 16:59:46 2016 +0200 msvc: Add snprintf/vsnprintf replacements MSVC pre-VS2015 has broken snprintf/vsnprintf implementations which are incompatible with C99 and may lead to buffer overflows. commit 5be32efc244d96aa56be462664b5c56d7318e86d [revision 2674] Author: Henrik Gramner Date: Sun Jan 31 20:21:01 2016 +0100 configure: Define feature test macros for --std=gnu99 Makes the printf() family functions on MinGW use the correct C99 POSIX versions instead of the broken pre-VS2015 Microsoft ones. Also allows us to get rid of some _GNU_SOURCE and _ISOC99_SOURCE defines. commit c01bf42117b811a0469f9f6c374f4a0daa98716d [revision 2673] Author: Henrik Gramner Date: Thu Jan 28 18:37:37 2016 +0100 mingw: Enable high-entropy ASLR on 64-bit Windows To fully utilize HEASLR the image base address must also be set above 4 GiB. For consistency use the same address as MSVC uses by default. This requires binutils 2.25 which isn't available on all common distributions, so only enable it after checking that it's supported. commit dd6b7b974e0057da726f71e10c24d057a339605b [revision 2672] Author: Henrik Gramner Date: Sun Jan 24 01:48:18 2016 +0100 msvs: WinRT support To compile x264 for WinRT the following additional steps has to be performed. * Ensure that the necessary SDK is installed. * Set the correct environment variables in the VS command prompt as shown at https://trac.ffmpeg.org/wiki/CompilationGuide/WinRT * Add one of the following to --extra-cflags depending on the target OS: "-DWINAPI_FAMILY=WINAPI_FAMILY_PC_APP -D_WIN32_WINNT=0x0A00" (Windows 10) "-DWINAPI_FAMILY=WINAPI_FAMILY_PC_APP -D_WIN32_WINNT=0x0603" (Windows 8.1) commit 7650a1367003e24f4f1b831682c012b5ba3e6c69 [revision 2671] Author: Henrik Gramner Date: Sun Jan 24 23:58:40 2016 +0100 configure: Disable CLI libraries when CLI is disabled commit 1ce062abb47ac59621b402cb26a1f14c91bb52bc [revision 2670] Author: Henrik Gramner Date: Fri Feb 5 18:46:13 2016 +0100 matroska: mk_close: Check fseek() return value commit de7af9185e172122cd9b800845e1988a52ad7cc3 [revision 2669] Author: Henrik Gramner Date: Fri Feb 5 18:46:02 2016 +0100 parse_qpfile: Check ftell() and fseek() return values commit fd2c324731c2199e502ded9eff723d29c6eafe0b [revision 2668] Author: Anton Mitrofanov Date: Sun Apr 10 20:13:59 2016 +0300 Use the correct default B-ref placement with B-pyramid Cost analyse functions expects the placement of the B-ref in a sequence of an even number of B-frames to be located towards the beginning while the actual placement was towards the end. Change the placement to be consistent with the analyse expectations, e.g. PbbBbP -> PbBbbP. commit e6a3f2989dd9eba3434c21fa94a6d9a5d1c7a9fe [revision 2667] Author: Henrik Gramner Date: Fri Feb 5 18:45:47 2016 +0100 parse_zones: Fix memory leak commit f86756985d42ac4a14866534c588061ede860b7b [revision 2666] Author: Alexey Samsonov Date: Mon Jan 25 16:05:25 2016 -0800 Fix float-cast-overflow in x264_ratecontrol_end function According to the C standard, it is undefined behavior to cast a negative floating point number to an unsigned integer. Float-cast-overflow in general is known to produce different results on different architectures. Building x264 code with Clang and -fsanitize=float-cast-overflow (http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#availablle-checks) and running it on some real-life examples occasionally produces errors of the form: encoder/ratecontrol.c:1892: runtime error: value -5011.14 is outside the range of representable values of type 'unsigned short' Fix these errors by explicitly coding the de-facto x86 behavior: casting float to uint16_t through int16_t. commit a01e33913655f983df7a4d64b0a4178abb1eb618 [revision 2665] Author: Sebastian Dröge Date: Sun Dec 20 23:49:35 2015 +0300 Fix AVC-Intra padding for non-Annex B encoding commit 1e4a24f305c006a95fec00131703d0e0ecae3a38 [revision 2664] Author: Anton Mitrofanov Date: Mon Jan 11 21:39:22 2016 +0300 ppc: Only perform AltiVec detection if compiled with AltiVec enabled commit b5953629117adc2b8d0d0eed6eb323c00587b428 [revision 2663] Author: Anton Mitrofanov Date: Tue Oct 13 15:30:16 2015 +0300 2-pass: Take into account possible frame reordering commit 20821a26ec510979e49fcfd6becc6ad7e2d8b388 [revision 2662] Author: Anton Mitrofanov Date: Tue Oct 13 12:54:05 2015 +0300 Revise the 2-pass algorithm commit 065321c48d0d371c1735b3cc9d368b43e1b64aaa [revision 2661] Author: Anton Mitrofanov Date: Tue Jan 5 02:41:43 2016 +0300 Revise the row VBV algorithm (part 2) Should fix rare cases of VBV emergency mode activation caused by too much trust to the row predictors. commit d23d18655249944c1ca894b451e2c82c7a584c62 [revision 2660] Author: Henrik Gramner Date: Fri Jan 1 12:44:31 2016 +0100 Bump dates to 2016 commit 3d972062c8a37d1a19586e2351e889b0a70beb40 [revision 2659] Author: Henrik Gramner Date: Mon Oct 26 19:54:20 2015 +0100 cli: Use memory-mapped input frames for yuv and y4m Improves performance by avoiding extraneous memory copying. Most beneficial on fast settings. On average around 5-10% faster overall on ultrafast but the performance improvement can be even larger in some cases. commit 38a5268dbec56adea750e05c4981f3bbb176e735 [revision 2658] Author: Henrik Gramner Date: Thu Jan 7 01:59:24 2016 +0100 y4m: Support extended frame headers when seeking Use the actual length of the frame header of the first frame instead of assuming a header without extensions when calculating the frame size. Also makes the frame counter more accurate with extended frame headers. commit cc652c158c1fa65bfeafb6446b5be855850065d0 [revision 2657] Author: Henrik Gramner Date: Tue Nov 3 17:55:08 2015 +0100 configure: Simplify cygwin/mingw/msys code Avoids some code duplication. Also drop the -mno-cygwin check since that option was removed back in 2008. commit 8b2d2a6d51abf51ad38dd8705d280448fbe63aaf [revision 2656] Author: Henrik Gramner Date: Mon Oct 26 18:52:46 2015 +0100 y4m: Avoid some redundant strlen() calls commit 24f7705f15cf6d59028a76a894d866b9fad85f39 [revision 2655] Author: Henrik Gramner Date: Sun Oct 25 17:15:10 2015 +0100 Simplify threadpool_wait commit 30ba5dc22fd0ae359e144847f2636574f659627d [revision 2654] Author: Henrik Gramner Date: Fri Oct 16 19:05:34 2015 +0200 windows: Use native threads by default --disable-win32thread can be passed as an argument to configure to compile with pthreads, which was the old default behavior. commit 1637239a64f3ec9a491b91202bd37097f15a253d [revision 2653] Author: Henrik Gramner Date: Sun Oct 11 22:32:11 2015 +0200 x86: Avoid some bypass delays and false dependencies A bypass delay of 1-3 clock cycles may occur on some CPUs when transitioning between int and float domains, so try to avoid that if possible. commit 7688814a7ec994f8e5984d199b465ccc068b98af [revision 2652] Author: Henrik Gramner Date: Sun Oct 11 22:32:03 2015 +0200 x86: Enable high bit-depth x264_coeff_last64_avx2_lzcnt The function existed but was never enabled. commit 366fa85885053c7b836a4272a4fbec1852103979 [revision 2651] Author: Geza Lore Date: Mon Oct 12 13:13:42 2015 +0100 x86inc: Add debug symbols indicating sizes of compiled functions Some debuggers/profilers use this metadata to determine which function a given instruction is in; without it they get can confused by local labels (if you haven't stripped those). On the other hand, some tools are still confused even with this metadata. e.g. this fixes `gdb`, but not `perf`. Currently only implemented for ELF. commit 70c3ba42e610b4182edda4fdeb10b37a2a70eb8f [revision 2650] Author: Henrik Gramner Date: Fri Oct 16 21:28:49 2015 +0200 x86inc: Avoid creating unnecessary local labels The REP_RET workaround is only needed on old AMD cpus, and the labels clutter up the symbol table and confuse debugging/profiling tools, so use EQU to create SHN_ABS symbols instead of creating local labels. Furthermore, skip the workaround completely in functions that definitely won't run on such cpus. This patch doesn't modify any emitted instructions, and doesn't actually affect x264 at all. It's only for other projects that use x86inc.asm without an appropriate `strip` command in their buildsystem. Note that EQU is just creating a local label when using nasm instead of yasm. This is probably a bug, but at least it doesn't break anything. commit 5c3d473a966e4b013759097fb98cd4a9cb5a34f5 [revision 2649] Author: Henrik Gramner Date: Thu Oct 15 17:42:49 2015 +0200 x86inc: Simplify AUTO_REP_RET cpuflags is never undefined any more, it's set to 0 instead. Also fix an incorrect comment. commit 28d68f090c0103704f5f6a86fcf362251774cd78 [revision 2648] Author: Henrik Gramner Date: Mon Oct 12 21:55:11 2015 +0200 x86inc: Use more consistent indentation commit 963b99efaaf1f0628b155e52b8a7c102cd1d37ff [revision 2647] Author: Henrik Gramner Date: Mon Oct 12 20:15:18 2015 +0200 x86inc: Preserve arguments when allocating stack space When allocating stack space with a larger alignment than the known stack alignment a temporary register is used for storing the stack pointer. Ensure that this isn't one of the registers used for passing arguments. commit 6e5033417a53fa66d002665618a1350d7417725e [revision 2646] Author: Henrik Gramner Date: Sun Jan 17 00:25:47 2016 +0100 x86inc: Improve FMA instruction handling * Correctly handle FMA instructions with memory operands. * Print a warning if FMA instructions are used without the correct cpuflag. * Simplify the instantiation code. * Clarify documentation. Only the last operand in FMA3 instructions can be a memory operand. When converting FMA4 instructions to FMA3 instructions we can utilize the fact that multiply is a commutative operation and reorder operands if necessary to ensure that a memory operand is used only as the last operand. commit 93cba743c78959ad97812dbaf894903c608912d0 [revision 2645] Author: Henrik Gramner Date: Sun Oct 11 22:31:53 2015 +0200 x86inc: Be more verbose in assertion failures commit 8017b33454397d59b3285ec6d2ad35b6d0deb58a [revision 2644] Author: Henrik Gramner Date: Wed Sep 30 23:17:00 2015 +0200 x86inc: Make cpuflag() and notcpuflag() return 0 or 1 Makes it possible to use them in arithmetic expressions. commit 5c6570495f8f1c716b294aee1430d8766a4beb9c [revision 2643] Author: Henrik Gramner Date: Fri Oct 30 16:55:49 2015 +0100 encoder_open: Fix memory leak Furthermore, the x264_analyse_prepare_costs() and x264_analyse_init_costs() functions were only used in x264_encoder_open(), so move that entire section of code to analyse.c as well to simplify things. commit 424534537a249dcf913e02560303f6afca423489 [revision 2642] Author: Janne Grunau Date: Wed Nov 18 11:08:22 2015 +0100 arm: do not fill mc_weight*_neon tabs for HIGH_BIT_DEPTH The asm is only for 8-bit and function prototypes reflect that. Avoids numerous warnings with --bit-depth=9/10. commit df51d8efa8ce9afcedda64acc69c1dba2648716d [revision 2641] Author: Janne Grunau Date: Tue Oct 13 23:50:11 2015 +0200 arm: Eliminate text relocations in asm Android 6 does not link shared libraries with text relocations. Make the movrel macro position independent and add movrelx for indirect loads of external symbols. Move the function pointer table for the aligned memcpy variants to the data.rel.ro section on Linux/Android. commit a2fe237af1b68f2bd53d64ed3faed62429d3ee5a [revision 2640] Author: Martin Storsjö Date: Thu Oct 15 11:50:33 2015 +0300 arm: Don't assume alignment in mbtree_propagate_list_internal where it isn't provided commit 9f422c0cd9c0abcd6a7abb10b51f8be883c39b2b [revision 2639] Author: Janne Grunau Date: Tue Oct 13 23:50:12 2015 +0200 arm: Fix checkasm register clobber check on iOS r9 is a volatile register in the iOS ABI and will therefore not be preserved by compiled functions like the luma motion compensation. Add the symbol prefix to the puts() call and use blx since a switch between arm and thumb mode might be required. commit 75992107adcc8317ba2888e3957a7d56f16b5cd4 [revision 2638] Author: Anton Mitrofanov Date: Thu Oct 1 01:02:16 2015 +0300 ppc: Add detection of AltiVec support for FreeBSD Patch from FreeBSD ports. commit 479d0c1fe73833ba65e0a10f6f5cf18df6def719 [revision 2637] Author: Anton Mitrofanov Date: Mon Sep 28 21:07:55 2015 +0300 Don't assume 16-byte stack alignment by default on x86-32 Some compilers depending on target OS uses 4-byte stack alignment by default. Explicitly check known good compilers and specific options for stack alignment. commit fad44d59b3adeb29b9c92fde0b80116cde79020e [revision 2636] Author: Anton Mitrofanov Date: Tue Sep 22 21:33:07 2015 +0300 Fix a few static analyzer performance hints commit de24c8c189364013e62d58d1e8f2fef878eb62bf [revision 2635] Author: Anton Mitrofanov Date: Tue Sep 22 20:19:23 2015 +0300 Revise the row VBV algorithm commit 001d30598c75d9bbc3aa80f67f9bdac17692437d [revision 2634] Author: Anton Mitrofanov Date: Tue Sep 22 19:26:25 2015 +0300 Fix high bit depth lookahead cost compensation algorithm Now high bit depth VBV should act more like 8-bit depth one. commit 91368390db9179226b5b4ed718a5788b754f9302 [revision 2633] Author: Anton Mitrofanov Date: Tue Sep 22 19:05:52 2015 +0300 Correctly update the intra row predictor in B-frames It was previously used but never updated from it's initialization value. commit e0d722f85f8599e324be2bebef9430155b25c329 [revision 2632] Author: Anton Mitrofanov Date: Tue Sep 22 18:58:24 2015 +0300 Change the predictors update algorithm Keep predictor offsets more stable. This should fix VBV misprediction in frames with a large difference in complexity between the top and bottom parts. commit 6f04b146875c45e6f7845a7bb5fb7fdf8e7534f1 [revision 2631] Author: Martin Storsjö Date: Thu Sep 3 09:30:44 2015 +0300 arm: Implement x264_mbtree_propagate_{cost, list}_neon The cost function could be simplified to avoid having to clobber q4/q5, but this requires reordering instructions which increase the total runtime. checkasm timing Cortex-A7 A8 A9 mbtree_propagate_cost_c 63702 155835 62829 mbtree_propagate_cost_neon 17199 10454 11106 mbtree_propagate_list_c 104203 108949 84532 mbtree_propagate_list_neon 82035 78348 60410 commit 3e25eab0b7172e3c0b067b8b6d641ce148d03db9 [revision 2630] Author: Martin Storsjö Date: Thu Sep 3 09:30:43 2015 +0300 x86: Share the mbtree_propagate_list macro with aarch64 This avoids having to duplicate the same code for all architectures that implement only the internal part of this function in assembler. commit 654901dfca73a21e2bb2366dda79eb413e9bfb66 [revision 2629] Author: Martin Storsjö Date: Wed Sep 2 22:39:51 2015 +0300 arm: Implement luma intra deblocking checkasm timing Cortex-A7 A8 A9 deblock_luma_intra[0]_c 5988 4653 4316 deblock_luma_intra[0]_neon 3103 2170 2128 deblock_luma_intra[1]_c 7119 5905 5347 deblock_luma_intra[1]_neon 2068 1381 1412 This includes extra optimizations by Janne Grunau. Timings from a separate build, on Exynos 5422: Cortex-A7 A15 deblock_luma_intra[0]_c 6627 3300 deblock_luma_intra[0]_neon 3059 1128 deblock_luma_intra[1]_c 7314 4128 deblock_luma_intra[1]_neon 2038 720 commit e2696a60a3e58d92e88e149b63c0b06a066eea9e [revision 2628] Author: Martin Storsjö Date: Mon Aug 31 22:40:31 2015 +0300 arm: Implement some neon 8x16c intra predict functions checkasm timing Cortex-A7 A8 A9 intra_predict_8x16c_dct_c 862 540 590 intra_predict_8x16c_dct_neon 608 511 657 intra_predict_8x16c_h_c 972 707 719 intra_predict_8x16c_h_neon 722 656 672 intra_predict_8x16c_p_c 10183 9819 8655 intra_predict_8x16c_p_neon 2622 1972 1983 commit 5db8b6b93aa91079ab785b9b49413625430536fd [revision 2627] Author: Martin Storsjö Date: Fri Aug 28 00:15:01 2015 +0300 arm: Implement x264_plane_copy_neon checkasm timing Cortex-A7 A8 A9 plane_copy_c 13124 10925 9106 plane_copy_neon 7349 5103 8945 commit 35d32d09e163bb0f2ce60a8e13f9f22125445346 [revision 2626] Author: Martin Storsjö Date: Fri Aug 28 09:40:24 2015 +0300 checkasm: arm: Check register clobbering Cast the function pointer to a different type signature, to be able to use uint64_t as return type (instead of intptr_t) for those calls that require it. Use two separate functions, depending on whether neon is available. commit 9cbdb635a4bd78e6767e735a062c0d9a5766b849 [revision 2625] Author: Martin Storsjö Date: Fri Aug 14 00:00:57 2015 +0300 checkasm: Try different widths for ssd_nv12 To test all codepaths in the aarch64 neon implementation, one at the very least needs to test with width 8, 16, 24 and 32. commit 39af8c72e618a544baa06ae427fb2b440861abcd [revision 2624] Author: Jerome Duval Date: Fri Jun 13 19:56:27 2014 +0000 Haiku support Add Haiku as supported platform in configure. Haiku has no nice() function, use the platform specific substitute instead. commit 59683a97b50b34c6282457a959bb6b3e9e7f8c0d [revision 2623] Author: Martin Storsjö Date: Tue Aug 25 14:38:20 2015 +0300 checkasm: aarch64: Check register clobbering Disable this on iOS, since it has got a slightly different ABI for vararg parameters. commit 5c13589be828b524100c787057d6bef77898c657 [revision 2622] Author: Martin Storsjö Date: Tue Aug 25 23:36:45 2015 +0300 arm: Implement x284_decimate_score15/16/64_neon checkasm timing Cortex-A7 A8 A9 decimate_score15_c 764 736 535 decimate_score15_neon 487 494 453 decimate_score16_c 782 727 553 decimate_score16_neon 487 494 521 decimate_score64_c 2361 2597 2011 decimate_score64_neon 1017 802 785 commit 3902ae02a0edede5d6c44cb3ee9e24e618c66e6a [revision 2621] Author: Martin Storsjö Date: Tue Aug 25 23:36:44 2015 +0300 arm: Implement chroma intra deblock checkasm timing Cortex-A7 A8 A9 deblock_chroma_420_intra_mbaff_c 1469 1276 1181 deblock_chroma_420_intra_mbaff_neon 981 717 644 deblock_chroma_intra[1]_c 2954 2402 2321 deblock_chroma_intra[1]_neon 947 581 575 deblock_h_chroma_420_intra_c 2859 2509 2264 deblock_h_chroma_420_intra_neon 1480 1119 1028 deblock_h_chroma_422_intra_c 6211 5030 4792 deblock_h_chroma_422_intra_neon 2894 1990 2077 commit e8b95e92792d9353277995043757430cf3dc3bf7 [revision 2620] Author: Martin Storsjö Date: Tue Aug 25 14:38:17 2015 +0300 arm: Implement x264_pixel_sa8d_satd_16x16_neon This requires spilling some registers to the stack, contray to the aarch64 version. checkasm timing Cortex-A7 A8 A9 sa8d_satd_16x16_neon 12936 6365 7492 sa8d_satd_16x16_separate_neon 14841 6605 8324 commit 6bbaa2758d53d0d6d645142d7d818c960d137a0e [revision 2619] Author: Martin Storsjö Date: Tue Aug 25 14:38:16 2015 +0300 arm: Implement x264_deblock_h_chroma_mbaff_neon checkasm timing Cortex-A7 A8 A9 deblock_chroma_420_mbaff_c 1944 1706 1526 deblock_chroma_420_mbaff_neon 1210 873 865 commit 3c66591e859045ef79a7131b991a5f20c80ffbb4 [revision 2618] Author: Martin Storsjö Date: Tue Aug 25 14:38:15 2015 +0300 arm: Implement x264_deblock_h_chroma_422_neon checkasm timing Cortex-A7 A8 A9 deblock_h_chroma_422_c 6953 6269 5145 deblock_h_chroma_422_neon 3905 2569 2551 commit 5265b927b0f2e043dd39cbbbf3909da0862d60e6 [revision 2617] Author: Martin Storsjö Date: Tue Aug 25 14:38:14 2015 +0300 arm: Implement integral_init4/8h/v_neon checkasm timing Cortex-A7 A8 A9 integral_init4h_c 10466 8590 6161 integral_init4h_neon 3021 1494 1800 integral_init4v_c 16250 13590 13628 integral_init4v_neon 3473 2073 3291 integral_init8h_c 10100 8275 5705 integral_init8h_neon 4403 2344 2751 integral_init8v_c 6403 4632 4999 integral_init8v_neon 1184 783 1306 commit b08403b5593307b919bfe5bfbd743da825326a4c [revision 2616] Author: Martin Storsjö Date: Tue Aug 25 14:38:13 2015 +0300 arm: Implement x264_denoise_dct_neon checkasm timing Cortex-A7 A8 A9 denoise_dct_c 6604 5510 5858 denoise_dct_neon 1774 1139 1614 commit ceee976bde76a5f4126bfd9d8454f0e601e67204 [revision 2615] Author: Martin Storsjö Date: Tue Aug 25 14:38:12 2015 +0300 arm: Add x264_nal_escape_neon checkasm timing Cortex-A7 A8 A9 nal_escape_c 852758 879566 655497 nal_escape_neon 376831 450678 371673 commit 8feb733ed1dcb1cc94df3b0e6c98009832ea85cc [revision 2614] Author: Martin Storsjö Date: Tue Aug 25 14:38:11 2015 +0300 arm: Add neon versions of vsad, asd8 and ssd_nv12_core These are straight translations of the aarch64 versions. checkasm timing Cortex-A7 A8 A9 vsad_c 16234 10984 9850 vsad_neon 2132 1020 789 asd8_c 5859 3561 3543 asd8_neon 1407 1279 1250 ssd_nv12_c 608096 591072 426285 ssd_nv12_neon 72752 33549 41347 commit 42b3b398664349d23b2122ac940417165424542d [revision 2613] Author: Martin Storsjö Date: Tue Aug 25 14:38:10 2015 +0300 checkasm: Check the right output range for integral_initXh These functions write their output into sum+stride, while we previously only checked [0..stride-8] within the sum array. This catches the previously broken aarch64 version of these functions. Also check up until stride-4 elements for init4h. commit 3d86abab097fa26d116112f188458269c6a0415f [revision 2612] Author: Janne Grunau Date: Thu Aug 20 13:55:54 2015 +0200 aarch64: Skip deblocking in 264_deblock_h_chroma_422_neon If the parameters (alpha, beta, tc0[]) indicated that the deblocking should have been skipped, every 2nd chrome line would have deblocked anyway. deblock_h_chroma_422_neon: 2259 (before) deblock_h_chroma_422_neon: 2192 (after) commit aec81efd3fe43008551916aa6073eb0732a58210 [revision 2611] Author: Janne Grunau Date: Mon Aug 17 16:39:20 2015 +0200 aarch64: Optimize various intra_predict asm functions Make them at least as fast as the compiled C version (tested on cortex-a53 vs. gcc 4.9.2). C NEON (before) NEON (after) intra_predict_4x4_dc: 260 335 260 intra_predict_4x4_dct: 210 265 200 intra_predict_8x8c_dc: 497 548 493 intra_predict_8x8c_v: 232 309 179 (arm64) intra_predict_8x16c_dc: 795 830 790 commit b16268ac0826d78455d0d704ea0fc8b1edc6b6bf [revision 2610] Author: Janne Grunau Date: Tue Aug 18 10:25:10 2015 +0200 aarch64: Faster intra_predict_4x4_h Use multiplication with 0x01010101 for splats. On a cortex-a53: gcc 4.9.2 llvm 3.6 neon (before) neon (after) intra_predict_4x4_h: 162 147 160/155 139/135 commit f2a6be92e5e42e8ef1daf74f63dbdbc4819d2070 [revision 2609] Author: Janne Grunau Date: Tue Aug 18 10:25:09 2015 +0200 aarch64: Fix coeff_level_run* macros with LLVM's assembler LLVM's integrated assembler does not treat symbols as integer constants. commit 592e92e9a8e47c3f0d0017c8158df5a4830e0bbd [revision 2608] Author: Janne Grunau Date: Tue Aug 18 10:25:08 2015 +0200 aarch64: Remove commas LLVM's assembler complains about commit 6efb57ada652fd015ec4cacffd09282632bb975b [revision 2607] Author: Martin Storsjö Date: Thu Aug 13 23:59:31 2015 +0300 arm: Implement x264_sub8x16_dct_dc_neon checkasm timing Cortex-A7 A8 A9 sub8x16_dct_dc_c 6386 3901 4080 sub8x16_dct_dc_neon 1491 698 917 commit 89439b2c604c81e13eb3da9e692d2cdae5a18b53 [revision 2606] Author: Martin Storsjö Date: Thu Aug 13 23:59:28 2015 +0300 arm: Optimize x264_deblock_h_chroma_neon Shuffle both chroma components together as a 16 bit unit, and don't write the unchanged columns (like in x264_deblock_h_luma_neon and in the aarch64 version of the function). This causes a minor slowdown for x264_deblock_v_chroma_neon, but it is negligible compared to the speedup. checkasm timing Cortex-A7 A8 A9 deblock_chroma[1]_c 4817 4057 3601 deblock_chroma[1]_neon 1249 716 817 (before) deblock_chroma[1]_neon 1249 766 845 (after) deblock_h_chroma_420_c 3699 3275 2830 deblock_h_chroma_420_neon 2068 1414 1400 (before) deblock_h_chroma_420_neon 1838 1355 1291 (after) commit ff71457d71c5c11ed825d848677cab09c7639012 [revision 2605] Author: Martin Storsjö Date: Thu Aug 13 23:59:27 2015 +0300 aarch64: Remove leftover commented out code commit ef6034812162fc8b51bfd5e87387f405d1cc30cb [revision 2604] Author: Martin Storsjö Date: Thu Aug 13 23:59:26 2015 +0300 aarch64: Simplify the decimate_score functions After doing a left shift by the number of bits returned by clz, only bits set to zero can be shifted out, so if the register was nonzero to start with (which is checked), it can't become zero here. commit d2b04a26b26d02c41ffb05cf1a605dafe9e6fa59 [revision 2603] Author: Martin Storsjö Date: Thu Aug 13 23:59:25 2015 +0300 arm: Use aligned loads in x264_coeff_last15_neon After subtracting 2, the pointer will be aligned. checkasm timing Cortex-A7 A8 A9 coeff_last15_c 423 375 230 coeff_last15_neon 350 420 404 (before) coeff_last15_neon 350 400 394 (after) commit 3f89a6bbee061cb0361770cf5b8495448515a011 [revision 2602] Author: Martin Storsjö Date: Thu Aug 13 23:59:24 2015 +0300 arm: Simplify x264_predict_8x8c_p_neon This gets rid of a few unnecessary (and confusing) steps in calculating the increment to i00. checkasm timing Cortex-A7 A8 A9 intra_predict_8x8c_p_c 5525 4732 4755 intra_predict_8x8c_p_neon 1719 1140 1262 (before) intra_predict_8x8c_p_neon 1663 1142 1255 (after) commit a0cd7d38acb6c31973228ab207e18344920e0aa3 [revision 2601] Author: Vittorio Giovara Date: Tue Sep 15 15:40:14 2015 +0200 lavf: Use the prefixed name for pixel format enum commit 63555e696a997ff795798d3357d770f8ab373cd9 [revision 2600] Author: Janne Grunau Date: Thu Sep 3 00:21:58 2015 +0200 aarch64: fix x264_mbtree_propagate_cost_neon The branch conditon caused the loop to execute one time more than intended. Detected by a memory corruption on arm with the 1 to 1 port of the function. commit 5c4728d8dd82ba46901824470db1609ae0f2521d [revision 2599] Author: Martin Storsjö Date: Thu Aug 13 23:59:22 2015 +0300 aarch64: Fix integral_init4/8h_neon The stride is the number of uint16_t elements and thus needs to be shifted. This issue had slipped unnoticed since checkasm didn't actually verify the output of these functions. commit 67076513267907b5601828ae6864cc063c8c7548 [revision 2598] Author: Henrik Gramner Date: Thu Aug 27 19:53:00 2015 +0200 x86: Fix integral_init4/8h_avx2 The AVX2 implementation was using the wrong offsets. It went undetected due to the checkasm test being incorrect. commit e86f3a1993234e8f26050c243aa253651200fa6b [revision 2597] Author: Mark Webster Date: Wed Aug 5 04:28:17 2015 +0100 Simplify inclusion of x264.h in C++ projects Name all structs to support forward declarations. Add a conditional extern "C" wrapper in x264.h itself instead of having to specify it in every location where it's included. commit 401941cc7099b322864600b62104940542497e7a [revision 2596] Author: Henrik Gramner Date: Sun Aug 16 21:59:26 2015 +0200 checkasm: Properly save rdx/edx in checkasm_call() on x86 If the return value doesn't fit in a single register rdx/edx can in some cases be used in addition to rax/eax. Doesn't affect any of the existing checkasm tests but it's more correct behavior and it might be useful in the future. commit 3dff8af3033a9e81d7966c5749fd361ce421467a [revision 2595] Author: Henrik Gramner Date: Tue Aug 11 17:19:35 2015 +0200 x86: Enable SSE2 by default on x86-32 It makes more sense to tune the defaults to benefit the vast majority of users. Anyone still using a Pentium III for video encoding is of course free to explicitly set different flags when compiling. commit 51d8aa09b777dc2969deaa954d5f6af9836c02ba [revision 2594] Author: Henrik Gramner Date: Mon Aug 10 22:30:21 2015 +0200 msvs/icl: Improve default CFLAGS Use -fp:fast as a substitute for -ffast-math. Increase warning level from -W0 to -W1 (the default setting). Disable -GS (stack cookies) on MSVS. It's disabled by default on ICL. commit 7edaf4b966aaee098ff301436f8d2b33a6fe5983 [revision 2593] Author: Henrik Gramner Date: Wed Aug 12 22:23:31 2015 +0200 Use a relative $SRCPATH for out-of-tree builds Fixes out-of-tree MSVS builds on Cygwin. commit e7b4b863dc2555ed835569c400d3a30f7ddc15ff [revision 2592] Author: Henrik Gramner Date: Sat Aug 8 22:26:38 2015 +0200 cygwin: Enable MSVS support `cl -showIncludes` creates absolute Windows paths for some files, attempt to convert those to Unix paths. Use relative paths for dependencies located in or below the working directory in order to mimic the behavior of gcc and to make the paths more readable. Make the dependency generation script a bit more robust in general. commit 817a4414b98e8a511c626932e7d433388bc96507 [revision 2591] Author: Henrik Gramner Date: Sat Aug 8 18:34:21 2015 +0200 cltostr.sh: Minor fixes commit 1a3d963441eaad25972763423d60158f597c5f65 [revision 2590] Author: Henrik Gramner Date: Sat Aug 8 12:21:54 2015 +0200 Simplify version.sh Also remove some non-POSIX syntax and improve robustness. As a bonus the script now runs about 2-3 times faster. `git rev-list --count` could be used to simplify things even further, but that functionality was added in git 1.7.2 so keep `wc -l` for now to maintain compatibility with older git versions. commit f7f6af76ef22e812ef330e2839488e83dd553836 [revision 2589] Author: 장영훈 Date: Fri Aug 7 14:43:24 2015 +0900 msvs: Fix cl detection in non-English environments commit e1a55bbbff2b4460ceb843f163e349fed7d32969 [revision 2588] Author: Henrik Gramner Date: Mon Aug 3 21:05:11 2015 +0200 x86inc: Sync minor changes from ffmpeg/libav commit 36f537b141da076032fd11f1745bb62d466dd7bf [revision 2587] Author: Henrik Gramner Date: Wed Jul 29 19:30:52 2015 +0200 matroska: Add comments for the remaining element names commit f04062e6380cbe10453dab33a3575c373e63ff9b [revision 2586] Author: Henrik Gramner Date: Wed Jul 29 19:30:41 2015 +0200 Silence various static analyzer warnings Those are false positives, but it doesn't hurt to get rid of them. commit b1cbf7ebe4a192bbc25cc910cb2910a34992f807 [revision 2585] Author: Henrik Gramner Date: Sun Jul 26 23:13:29 2015 +0200 mingw: Enable the tsaware linker flag Avoids an irrelevant compatibility layer in Terminal Services environments. https://msdn.microsoft.com/en-us/library/cc834995.aspx commit 8a1ff031ecd4b423fc373540b9b68cdf97602bbf [revision 2584] Author: Henrik Gramner Date: Sun Jul 26 23:13:26 2015 +0200 msvs: Don't redefine snprintf for VS2015 Visual Studio 2015 has a proper snprintf implementation. commit aa9d22927c0264c08c11c9e72294fc651a155b3e [revision 2583] Author: Henrik Gramner Date: Sun Jul 26 23:13:19 2015 +0200 msvs: Prefer link.exe from the same directory as cl.exe /usr/bin/link from coreutils may be located before the MSVS linker in $PATH which causes linking to fail due to using the wrong binary. commit ca8bd68063d74227d917f34fd50942265f9a106c [revision 2582] Author: Henrik Gramner Date: Mon Jul 27 00:10:00 2015 +0200 frame_dump: check fseek() return value commit 53b3b747e22f53204f6efb5106ab4a5a8eb57626 [revision 2581] Author: Henrik Gramner Date: Mon Jul 27 00:08:38 2015 +0200 x264_vfprintf: use va_copy It's undefined behavior to use the same va_list twice. This most likely didn't cause any issues in practice since the string would have to be larger than 4 KiB to trigger the fallback path. Use workaround for ICL as it doesn't define va_copy even for C99. commit 59e7ded846a832125cb533aadff9895487771ea7 [revision 2580] Author: Henrik Gramner Date: Mon Jul 27 00:08:31 2015 +0200 param_parse: Fix framerate rounding issues commit 73ae2d11d472d0eb3b7c218dc1659db32f649b14 [revision 2579] Author: Marcin Juszkiewicz Date: Mon Jun 1 11:24:45 2015 +0200 aarch64: Remove broken CFLAGS in configure GCC doesn't have an "-arch" switch, but works when that entire line is removed. commit cc002bd545b008b1cdc7c6d7cc0c616ba125d4d5 [revision 2578] Author: Rong Yan Date: Mon Jul 20 03:34:20 2015 -0500 ppc: Add little-endian PowerPC support commit 145f3a6275802a649b8dedb49bb0e054caf31717 [revision 2577] Author: Rishikesh More Date: Thu Jun 18 17:48:46 2015 +0530 mips: MSA quant optimizations Signed-off-by: Rishikesh More commit 16395d2b6f827b076612eb5b70711b79621da67e [revision 2576] Author: Rishikesh More Date: Thu Jun 18 17:48:45 2015 +0530 mips: MSA predict optimizations Signed-off-by: Rishikesh More commit 204e1a60237e0b3168ccbdb2905c9af8188b90ee [revision 2575] Author: Rishikesh More Date: Thu Jun 18 17:48:44 2015 +0530 mips: MSA pixel optimizations Signed-off-by: Rishikesh More commit 3ce6430eb11839c69d606c59c0f8c31ce0b6dd17 [revision 2574] Author: Rishikesh More Date: Thu Jun 18 17:48:43 2015 +0530 mips: MSA deblock optimizations Signed-off-by: Rishikesh More commit 57618eead025eaf654226add94689d6d2999ccf6 [revision 2573] Author: Rishikesh More Date: Thu Jun 18 17:48:42 2015 +0530 mips: MSA dct optimizations Signed-off-by: Rishikesh More commit 4ebb23aaf4f46b7a04aa8aefa3c08e7b6493de4c [revision 2572] Author: Rishikesh More Date: Thu Jun 18 17:48:40 2015 +0530 mips: MSA mc optimizations Signed-off-by: Rishikesh More commit cd19444d3f9915a5a33a95e308bc8021d7e62afe [revision 2571] Author: Rishikesh More Date: Thu Jun 18 17:48:38 2015 +0530 mips: Common MSA macros Add macros for load/store, slide, shift, transpose and basic arithmetic operations required by subsequent patches. Signed-off-by: Rishikesh More commit 72b82bd98a99b1d75322b70a74365547382ce062 [revision 2570] Author: Rishikesh More Date: Tue May 12 19:38:09 2015 +0530 mips: Add MSA support to checkasm Signed-off-by: Rishikesh More commit ce0757d9d2778e349a7c2f6445b6aa75d8765c30 [revision 2569] Author: Kaustubh Raste Date: Fri Apr 17 17:38:58 2015 +0530 mips: Initial MSA support MSA is the MIPS SIMD Architecture. Add X264_CPU_MSA define. Update configure to detect MIPS platform and set flags. CPU-specific gcc options are expected through --extra-cflags. Sample command line for mips32r5: ./configure --host=mipsel-linux-gnu --cross-prefix=/mips-mti-linux-gnu- --extra-cflags="-EL -mips32r5 -msched-weight -mload-store-pairs" Signed-off-by: Kaustubh Raste commit 9140ee1fb39bd4a4ccace28091398e8a96704f07 [revision 2568] Author: Anton Mitrofanov Date: Fri Jul 17 00:22:29 2015 +0300 Limit autodetection of threads number according to the source height commit aeaed2d07b5b43437bb640e1f987d42a6fab03b9 [revision 2567] Author: Anton Mitrofanov Date: Thu Jul 16 19:04:59 2015 +0300 Fine-tune of frame's size predictors at ratecontrol start This is attempt to improve VBV at start of video with a lot of threads which delay feedback for predictors. commit aa275158641e94203003157947d43ff4cc685068 [revision 2566] Author: Anton Mitrofanov Date: Thu Jul 16 16:15:56 2015 +0300 Use forced frame types in slicetype analysis This should improve MBTree and VBV when a lot of forced frame types are used. commit a83edfa053f60ad0c8a164f31e7492a680eef361 [revision 2565] Author: Henrik Gramner Date: Mon Dec 1 22:05:42 2014 +0100 x86: SSSE3 and AVX2 implementations of plane_copy_swap For NV21 input. commit 627f891c571cacb51deb5e211b23c309b14a6587 [revision 2564] Author: Yu Xiaolei Date: Fri Jun 6 16:05:27 2014 +0800 NV21 input support Eliminates an extra copy when encoding Android camera preview images. Checkasm test by Janne Grunau. ARM assembly with improvements from Janne Grunau. commit 6ee94dc898dc029553e308f1e76891ccefb3f0a7 [revision 2563] Author: Henrik Gramner Date: Tue Jun 23 17:00:47 2015 +0200 deblock: Write combining commit 08a9c51919f4edbd6e484155e5521a92a0800651 [revision 2562] Author: Henrik Gramner Date: Tue Jun 23 14:59:59 2015 +0200 Get rid of some tabs and trailing whitespaces commit b568a256b9bc6c500d7b1ffe4b9c3311ee5ff337 [revision 2561] Author: Henrik Gramner Date: Sat May 23 19:44:16 2015 +0200 x86: Experimental nasm support Enables the use of nasm as an alternative to yasm. Note that nasm cannot assemble x264 with PIC enabled since it currently doesn't support [symbol-$$] addressing which is used extensively by x264's PIC code. This includes all 64-bit Windows and 64-bit OS X builds, even non-shared. For the above reason nasm is currently intentionally not auto-detected, instead the assembler must be explicitly specified using "AS=nasm ./configure". Also drop -O2 from ASFLAGS since it's simply ignored anyway. commit d14e38c059c9a2aecc82477b99d56ef74eb731ec [revision 2560] Author: Timothy Gu Date: Tue May 26 19:12:42 2015 +0200 x86inc: Prevent warnings when using `struc` and `endstruc` struc and endstruc attempts to revert to the previous section state set by the SECTION macro. Use the primitive [SECTION] directive instead of the SECTION macro for the .note.GNU-stack section to prevent it from being emitted again during endstruc. commit 353b1f888c34081e94727a1ffa0e4920e2cfe8a9 [revision 2559] Author: Henrik Gramner Date: Wed May 27 21:38:14 2015 +0200 x86inc: Drop SECTION_TEXT macro The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`. commit b615f82e45c88b7915c5571ad09fa65a0b6130d7 [revision 2558] Author: Henrik Gramner Date: Sat May 23 13:38:05 2015 +0200 x86inc: Disable vpbroadcastq workaround in newer yasm versions The bug was fixed in 1.3.0, so only perform the workaround in earlier versions. commit 8f834d6ccc054d8c32d84310664dc07abac553ec [revision 2557] Author: Henrik Gramner Date: Sun May 24 22:57:00 2015 +0200 Prefer Unicode versions of Windows API calls Just for consistency, doesn't affect behavior. commit 3f8c8eb1758d0fa890538eba6f5e699c93dc1304 [revision 2556] Author: Henrik Gramner Date: Sun May 24 23:21:20 2015 +0200 Get rid of fPIC warnings when compiling a shared library on Windows PIC is always enabled when compiling for Windows so gcc complains when using -fPIC since it doesn't do anything. commit 0c21480fa2fdee345a3049e2169624dc6fc2acfc [revision 2555] Author: Henrik Gramner Date: Sat Jul 25 22:42:59 2015 +0200 matroska: Write the correct DocTypeVersion when using frame-packing The StereoMode element is only valid with DocTypeVersion 3 or higher. commit 791d265281af1d022a72ba9e003a987e97da5c0d [revision 2554] Author: Anton Mitrofanov Date: Sat Jul 25 00:21:52 2015 +0300 dump_yuv: Fix file handle leak commit d6aa586b2f83eeb776744c2e97a8ce9e1181c59b [revision 2553] Author: Anton Mitrofanov Date: Sat Jul 25 00:20:47 2015 +0300 mp4: Fix file handle leak commit 942e4e4530d0909c2b580be88acd18d1e5fa4fa8 [revision 2552] Author: Henrik Gramner Date: Wed Jun 24 00:40:45 2015 +0200 flv: Check fseek() and fwrite() return values commit 250d5b0e13045f6a1ebfeb379933b5c5daa9cf41 [revision 2551] Author: Henrik Gramner Date: Wed Jun 24 00:22:56 2015 +0200 flv: Fix memory and file handle leaks commit 3533520655ef095ef009af9b6b27a20b45fd13ee [revision 2550] Author: Henrik Gramner Date: Wed Jun 24 01:23:35 2015 +0200 avs: Fix file handle leak commit df152a77e1b17065aecb40c9a2a28d5953887ac9 [revision 2549] Author: Henrik Gramner Date: Tue Jun 23 13:38:02 2015 +0200 matroska: Fix memory leak commit 6d5249977f5d62f6e167a062bdd94d8546eca1f7 [revision 2548] Author: Henrik Gramner Date: Tue Jun 23 13:24:29 2015 +0200 rdo: Fix potential CAVLC overflow issues commit 936e8da1a4f9d0431b181d0877bb1602d4de9441 [revision 2547] Author: Henrik Gramner Date: Tue Jun 23 22:08:35 2015 +0200 slurp_file: Various minor bug fixes * Fix unsigned <= 0 check. * Add additional size sanity check on 32-bit systems. * Don't read uninitialized data if fread() fails. commit d302526d5b97818f588b86f408f910924790242e [revision 2546] Author: Henrik Gramner Date: Tue Jun 23 22:47:53 2015 +0200 param_parse: Check strdup() return value commit 94e476d80b9635508907893c97e8f8d9f0bc9ddf [revision 2545] Author: Henrik Gramner Date: Tue Jun 23 15:38:16 2015 +0200 param_parse: Fix memory leak commit 45856b9787eab95434d66b4bc2e18819483f0e43 [revision 2544] Author: Anton Mitrofanov Date: Fri Jun 19 16:01:12 2015 +0300 Add FreeBSD's stdint.h header guard to allowed list Patch written by Koop Mast commit 35cf1a2cbf253e43cab7747eb903a3b844bd42c1 [revision 2543] Author: Henrik Gramner Date: Fri May 22 19:23:33 2015 +0200 x86: Prevent overread of src in plane_copy_interleave Could only occur in 4:2:2 with height == 1. Also enable asm for inputs with different U/V strides as long as the strides have identical signs. commit 003414a4b3724f0972e4507dfd1432dd442d2228 [revision 2542] Author: Anton Mitrofanov Date: Wed May 20 23:10:20 2015 +0300 checkasm: Fix incorrect memcmp size for ARM architecture commit e08fdc81018489217f4bafe7321a3baf372fac1f [revision 2541] Author: Anton Mitrofanov Date: Sun Apr 26 20:51:05 2015 +0300 Fix possible use of uninitialized MVs in lookahead analysis for B-frames commit 0b0210857ef13214f12861dec672006455a556d6 [revision 2540] Author: Anton Mitrofanov Date: Tue Apr 21 23:08:19 2015 +0300 Catch incorrect usage of libx264 API for delayed frames flushing commit 3a6bd39a650b47572743c2d2ea2fd7c214053fb2 [revision 2539] Author: Anton Mitrofanov Date: Sat Mar 7 23:00:09 2015 +0300 Fix detection of system libx264 configuration commit 121396c71b4907ca82301d1a529795d98daab5f8 [revision 2538] Author: Anton Mitrofanov Date: Mon Feb 23 14:23:18 2015 +0300 Cosmetic changes commit 8e71b432e5dbe835fa4516064f6841a03c79b183 [revision 2537] Author: Anton Mitrofanov Date: Wed Dec 31 02:15:05 2014 +0300 Update configure for auto detection of system libx264 configuration commit 0f84192e88d6adc4512f6f320a50a09b4608634c [revision 2536] Author: Anton Mitrofanov Date: Tue Feb 3 14:51:28 2015 +0300 Add tile format frame packing value Defined in 2014-02 edition. commit f08b1c6b8e186ff5a931e9a80e8923e42efff0e4 [revision 2535] Author: Anton Mitrofanov Date: Tue Feb 3 13:39:14 2015 +0300 Stricter validation of crop-rect values commit 196cb9ab52af9370fc66a474ffc4a52a75dc5eb4 [revision 2534] Author: Vittorio Giovara Date: Tue Jan 20 16:15:56 2015 +0000 Add mono frame packing value Defined in 2013-04 edition. commit c8a773ebfca148ef04f5a60d42cbd7336af0baf6 [revision 2533] Author: Vittorio Giovara Date: Tue Jan 20 15:57:41 2015 +0000 Validate frame packing value instead of clipping commit a95584945dd9ce3acc66c6cd8f6796bc4404d40d [revision 2532] Author: Christophe Gisquet Date: Tue Feb 3 20:40:41 2015 +0100 x86inc: Correctly warn on use of SSE2 instructions in SSE functions SSE2 instructions that are XMM-implementations of pre-existing MMX/MMX2 instructions did not issue warnings when used in SSE functions. Handle it by also checking the register type when such instructions are used. commit 23d4434de9ab5ef32ebb03401d971b8579a65fc6 [revision 2531] Author: Christophe Gisquet Date: Tue Feb 3 18:02:30 2015 +0100 x86inc: Fix instantiation of YMM registers commit 4c75f3d729aaf3bcb00edf789c71f09495374bdf [revision 2530] Author: Vittorio Giovara Date: Tue Jan 20 16:28:54 2015 +0000 matroska: Correctly write display width and height in stereo mode According to the specifications, when stereo mode is set, these values represent the single view size. commit c3ba2a8c595b1bb36da55b82f7f4046471349d0e [revision 2529] Author: Kieran Kunhya Date: Tue Jan 20 09:38:00 2015 -0600 Use POC type 0 for AVC-Intra Based on a patch from Capella Systems commit b77cc09b9252d70f78726f2472391b63948d9895 [revision 2528] Author: Anton Mitrofanov Date: Sat Jan 3 15:46:19 2015 +0300 Fix ARCH variable name conflict with BSD ports (bsd.port.mk) read-only variable commit 6e769846626f9185b59f3967e8b4ebe11497d878 [revision 2527] Author: Anton Mitrofanov Date: Sat Dec 27 20:35:39 2014 +0300 Fix negative percentages in final stats output They were caused by integer overflow when encoding long UHD video. commit d7ccd89f1bea53c8c524f8e6eb963d57defb6813 [revision 2526] Author: Anton Mitrofanov Date: Sat Jan 3 23:35:23 2015 +0300 Bump dates to 2015 commit 40bb56814e56ed342040bdbf30258aab39ee9e89 [revision 2525] Author: Anton Mitrofanov Date: Mon Dec 15 18:49:23 2014 +0300 x86: Update intel compiler cpu dispatcher override for new versions of ICC/ICL commit d72a85b549acd981a8dae3dc5b71920ab2aeea4f [revision 2524] Author: Anton Mitrofanov Date: Tue Sep 6 21:53:29 2011 +0400 New AQ mode: auto-variance AQ with bias to dark scenes Also known as --aq-mode 3 or auto-variance AQ modification. commit f4a455a43df3088bae5208dcc98b8f6214fdce7d [revision 2523] Author: Anton Mitrofanov Date: Wed Aug 29 03:02:27 2012 +0400 Improve HRD conformance commit fa3549b5f2478f39cbcbd14d2e956e59f70d18eb [revision 2522] Author: Henrik Gramner Date: Fri Nov 28 23:24:56 2014 +0100 x86: SSE and AVX implementations of plane_copy Also remove the MMX2 implementation and fix src overread for height == 1. commit 8797e0f8d416aadb91d359f144e4e7855071870a [revision 2521] Author: Anton Mitrofanov Date: Mon Sep 29 23:26:19 2014 +0400 Update to the latest version of gas-preprocessor.pl from http://git.libav.org/?p=gas-preprocessor.git Contributions by Janne Grunau, Martin Storsjo, Mans Rullgard, David Conrad, Martin Aumuller and others commit 59b9c252cfa6242c7fa6424a463e51913996fe6a [revision 2520] Author: Janne Grunau Date: Wed Nov 19 00:33:55 2014 +0100 aarch64: cabac_encode_{decision,bypass,terminal}_asm benchmarks on a Nexus 9 (nvidia denver): 101.3 cycles in x264_cabac_encode_decision_c, 67105369 runs, 3495 skips 97.3 cycles in x264_cabac_encode_decision_asm, 67105493 runs, 3371 skips 132.8 cycles in x264_cabac_encode_terminal_c, 1046950 runs, 1626 skips 116.1 cycles in x264_cabac_encode_terminal_asm, 1048424 runs, 152 skips 92.4 cycles in x264_cabac_encode_bypass_c, 16776192 runs, 1024 skips 89.6 cycles in x264_cabac_encode_bypass_asm, 16776453 runs, 763 skips Cycle counts are not as stable as one would like. The dynamic code optimisation seems to produce different results for small chnages in a binary. Repeated runs with the same binary produce stable results though (ignoring the first run). commit a6ec424939a4d3a59e4ec1e3999cb37e4314408e [revision 2519] Author: Janne Grunau Date: Thu Nov 6 09:20:17 2014 +0100 checkasm: add cycle counter read for aarch64 Needs kernel support since user space access to the cycle counter is not allowed on all available AArch64 systems (Android 5 and iOS). commit fa7e9d3d082327ceeacfaf85da6cde4c50fb4e5b [revision 2518] Author: Janne Grunau Date: Wed Nov 5 11:35:13 2014 +0100 aarch64: nal_escape_neon 3-4 times faster. commit f13573e490d9f18bbcb10409fb09ec25e477035e [revision 2517] Author: Janne Grunau Date: Fri Oct 31 14:49:04 2014 +0100 aarch64: {plane_copy,memcpy_aligned,memzero_aligned}_neon 2-3 times faster than C. commit 8d655b63b4f7bc021ad038ea64b7c4de9d0ef74b [revision 2516] Author: Janne Grunau Date: Wed Oct 29 18:17:48 2014 +0100 aarch64: x264_mbtree_propagate_{cost,list}_neon x264_mbtree_propagate_cost_neon is ~7 times faster. x264_mbtree_propagate_list_neon is 33% faster. commit 4d400a6ec67f17ae3b17876b0318b956b6d5c856 [revision 2515] Author: Janne Grunau Date: Tue Oct 21 15:18:49 2014 +0200 aarch64: x264_denoise_dct_neon 3.5 times faster. commit 4e8ac132cc2feff5786d12c90fd62cf97979bae1 [revision 2514] Author: Janne Grunau Date: Mon Oct 20 13:12:14 2014 +0200 aarch64: x264_coeff_level_run{4,8,15,16} All functions ~33% faster. commit dd7666742d5a1a7af076fb388c6adf1b10dcdb3e [revision 2513] Author: Janne Grunau Date: Tue Oct 14 19:20:52 2014 +0200 aarch64: NEON asm for intra luma deblocking deblock_luma_intra[0]_neon is 2 times fastes, deblock_luma_intra[1]_neon is ~4 times faster. commit 0122fd230cbf7351845dd354d5ee883d741222ef [revision 2512] Author: Janne Grunau Date: Mon Oct 13 17:29:22 2014 +0200 aarch64: x264_deblock_h_chroma_422_neon deblock_h_chroma_422 2.5 times faster commit 44cb1dcdbdaafeddd98d2ebe3d02408bc380713e [revision 2511] Author: Janne Grunau Date: Mon Oct 13 12:43:50 2014 +0200 aarch64: x264_deblock_h_chroma_mbaff_neon deblock_chroma_420_mbaff_neon 2 times faster commit f2e439d113ae86a0a1ef8215d4d4111892aed3f7 [revision 2510] Author: Janne Grunau Date: Fri Oct 10 10:29:15 2014 +0200 aarch64: NEON asm for intra chroma deblocking deblock_h_chroma_420_intra, deblock_h_chroma_422_intra and x264_deblock_h_chroma_intra_mbaff_neon are ~3 times faster. deblock_chroma_intra[1] is ~4 times faster than C. commit ce6c94c0bef3350e9546302aae5909404b056fdb [revision 2509] Author: Janne Grunau Date: Tue Sep 2 10:27:22 2014 +0200 aarch64: add myself as author to aarch64/mc.h commit be7e5fa6eee2731abdb1b41bc2a4c1a29e672747 [revision 2508] Author: Janne Grunau Date: Thu Aug 14 14:22:50 2014 +0100 aarch64: NEON asm for integral init integral_init4h_neon and integral_init8h_neon are 3-4 times faster than C. integral_init8v_neon is 6 times faster and integral_init4v_neon is 10 times faster. commit eb1d35725e542968c4a6480c157db40570477a95 [revision 2507] Author: Janne Grunau Date: Wed Aug 13 13:30:53 2014 +0100 aarch64: NEON asm for 8x16c intra prediction Between 10% and 40% faster than C. commit 40d5db342b7f5198db9826a51f31e454bd208596 [revision 2506] Author: Janne Grunau Date: Tue Aug 12 17:26:10 2014 +0200 aarch64: NEON asm for decimate_score decimate_score15 and 16 are 60% faster, decimate_score64 is 4 times faster than C. commit 45e1ebf88a1c3bf37e1326ce621a9b735d155885 [revision 2505] Author: Janne Grunau Date: Fri Aug 8 11:19:35 2014 +0100 aarch64: implement x264_sub8x16_dct_dc_neon 4 times faster than C. commit 90f0b5c1c881f345c9da15bc482055f2a92f8ceb [revision 2504] Author: Janne Grunau Date: Thu Aug 7 19:46:07 2014 +0200 aarch64: implement x264_pixel_asd8_neon 7 times faster than C. commit f8f8d13d5978b13fc831e041e52aa617550bbdf3 [revision 2503] Author: Janne Grunau Date: Thu Aug 7 16:49:12 2014 +0200 aarch64: NEON asm for 4x16 sad, satd and ssd pixel_sad_4x16_neon: 33% faster than C pixel_satd_4x16_neon: 5 times faster pixel_ssd_4x16_neon: 4 times faster commit 35b91f2410dcf4fc5191dd85ccda7a42eb01eae8 [revision 2502] Author: Janne Grunau Date: Wed Jul 30 15:48:25 2014 +0100 aarch64: implement x264_pixel_ssd_nv12_core_neon 13 times faster than C. commit 99a1ca1f1a62d51e47d1ac2c92ee9c3bf3b5712b [revision 2501] Author: Janne Grunau Date: Tue Jul 29 18:26:11 2014 +0100 aarch64: implement x264_pixel_vsad_neon 35 times faster than C. commit 6c1632493e5afac8be1e1693377dab27f4704a1d [revision 2500] Author: Janne Grunau Date: Tue Jul 29 11:06:24 2014 +0100 aarch64: NEON asm for missing x264_zigzag_* functions zigzag_scan_4x4_field_neon, zigzag_sub_4x4_field_neon, zigzag_sub_4x4ac_field_neon, zigzag_sub_4x4_frame_neon, igzag_sub_4x4ac_frame_neon more than 2 times faster zigzag_scan_8x8_frame_neon, zigzag_scan_8x8_field_neon, zigzag_sub_8x8_field_neon, zigzag_sub_8x8_frame_neon 4-5 times faster zigzag_interleave_8x8_cavlc_neon 6 times faster commit d040d28514db7d1fbd5c3f06c37a77de14b15e5b [revision 2499] Author: Janne Grunau Date: Fri Jul 25 11:53:17 2014 +0100 aarch64: implement x264_pixel_sa8d_satd_16x16_neon ~20% faster than calling pixel_sa8d_16x16 and pixel_satd_16x16 separately. commit 91a01d4ca95ee1c621578e118b86d767eab96b3b [revision 2498] Author: Janne Grunau Date: Thu Aug 14 23:13:27 2014 +0200 aarch64: optimize x264_predict_8x8c_dc_left_neon 25% faster than the previous version. commit 8ae4e1cfa3d16451ccf285228d309f6f4940a747 [revision 2497] Author: Henrik Gramner Date: Sat Aug 2 18:26:18 2014 +0200 x86: Make AVX2 also imply FMA3 All CPUs with AVX2 supports FMA3 (but not the other way around). commit 06882793b260824bc578d0530f64e7f30f2a9f39 [revision 2496] Author: Anton Mitrofanov Date: Thu Nov 13 22:52:00 2014 +0300 Simplify libx264 API usage example commit 6a301b6ee0ae8c78fb704e1cd86f4e861070f641 [revision 2495] Author: Henrik Gramner Date: Fri Nov 21 23:47:20 2014 +0100 AvxSynth: Remove a bunch of unused cruft commit 30140b34b879605cf70cab0634a4a8faef5b6e60 [revision 2494] Author: Anton Mitrofanov Date: Wed Dec 3 22:36:12 2014 +0300 Fix bugs/typos in motion compensation and cache_load Didn't affect output due to the incorrect values either not being used in the code path or producing equal results compared to the correct values. Also deduplicate hpel_ref arrays. commit a46820e00ad3c86b80f5830ed92553de474b7d5c [revision 2493] Author: Anton Mitrofanov Date: Sun Nov 30 23:39:28 2014 +0300 checkasm: Fix undefined behavior warnings commit 4e97ca566fdf6cd36281e26ee68f64993f4751a1 [revision 2492] Author: Henrik Gramner Date: Sat Nov 29 18:47:52 2014 +0100 checkasm: Fix V210 reporting It would previously report FAILED if any of the earlier plane_copy tests failed. commit 24e4fed388fcb34c33df7c87e7d6758b9ebed40c [revision 2491] Author: Anton Mitrofanov Date: Sun Oct 12 21:01:53 2014 +0400 Safety check against malicious high bit-depth input which could cause crash commit 9bec6fed6d1b95f9921f22ba21e7398eff50b75e [revision 2490] Author: Anton Mitrofanov Date: Sun Oct 12 20:45:40 2014 +0400 libx264 API usage example commit 329fe5f6498be7ab337d98ac22c17d379335c854 [revision 2489] Author: Henrik Gramner Date: Fri Oct 17 21:35:42 2014 +0200 x86: AVX2 high bit-depth var_16x16 40->27 cycles on Haswell. commit 4576cfd8c391b27748d6f97f5b621cec4ed8047c [revision 2488] Author: Henrik Gramner Date: Wed Oct 8 22:25:35 2014 +0200 checkasm: Serialize read_time() calls on x86 Improves the accuracy of benchmarks, especially in short functions. To quote the Intel 64 and IA-32 Architectures Software Developer's Manual: "The RDTSC instruction is not a serializing instruction. It does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the read operation is performed. If software requires RDTSC to be executed only after all previous instructions have completed locally, it can either use RDTSCP (if the processor supports that instruction) or execute the sequence LFENCE;RDTSC." RDTSCP would accomplish the same task, but it's only available since Nehalem. This change makes SSE2 a requirement to run checkasm. commit b85a74a22f79c8722674c4cfd7cddf5f54c8421d [revision 2487] Author: Vittorio Giovara Date: Mon Sep 29 18:51:30 2014 +0100 Support case-independent string options commit 20f116b29e93574e9607d1abf2960f32b5730e52 [revision 2486] Author: Anton Mitrofanov Date: Sat Sep 6 20:44:49 2014 +0400 Shut up gcc -Wuninitialized warnings commit 3df1d248dd8a4b0d0dffd149effe2bde38de49aa [revision 2485] Author: Anton Mitrofanov Date: Fri Sep 5 19:43:52 2014 +0400 Shut up clang -Wuninitialized warning commit 01204b60367f4959e8393652dd30f0cfba2d2c80 [revision 2484] Author: Anton Mitrofanov Date: Fri Sep 5 19:30:47 2014 +0400 Fix few clang -Wunused-* warnings commit 9df377f87702c82a2202d34919c07e32c60b40ae [revision 2483] Author: Anton Mitrofanov Date: Thu Aug 28 20:13:13 2014 +0400 Fix inappropriate instruction use commit 73b8686fc22c9247d90963983d406cd7b9131068 [revision 2482] Author: Anton Mitrofanov Date: Thu Aug 28 18:38:53 2014 +0400 x264asm: warn when inappropriate instruction used in function with specified cpuflags commit 204a9bd0a1bc507cbd69a77f3318afcb56ede65d [revision 2481] Author: Anton Mitrofanov Date: Tue Sep 2 01:48:00 2014 +0400 Fix VBV with true VFR streams commit b36d44c68cddff00c5b6de1e6cb6a86c1af2cbfc [revision 2480] Author: Anton Mitrofanov Date: Mon Sep 1 22:45:00 2014 +0400 Fix VBV commit dd79a61e0e354a432907f2d1f7137b27a12dfce7 [revision 2479] Author: Anton Mitrofanov Date: Wed Jul 30 03:03:32 2014 +0400 Update to the current lavf API and fix memory leak when using --seek commit 91727d729a4a33a3f21188f838077040740cb353 [revision 2478] Author: Henrik Gramner Date: Tue Aug 5 01:42:55 2014 +0200 x86inc: Make INIT_CPUFLAGS support an arbitrary number of cpuflags Previously there was a limit of two cpuflags. commit d4317786b8428b00978459f6de3db219f0f6f8e6 [revision 2477] Author: Henrik Gramner Date: Tue Aug 5 01:42:51 2014 +0200 x86: Minor pixel_ssim_end4 improvements Reduce the number of vector registers used from 7 to 5. Eliminate some moves in the AVX implementation. Avoid bypass delays for transitioning between int and float domains. commit 98100b88b475227f375d9bcbaea0bac57008accc [revision 2476] Author: Henrik Gramner Date: Tue Aug 5 01:42:47 2014 +0200 x86: Faster quant_4x4x4 Also drop the MMX version instead of doing a bunch of ifdeffery to support it after this change. commit 56fcb444c4c118ff67cf12838d2b2801d7b43407 [revision 2475] Author: Anton Mitrofanov Date: Sun Aug 10 22:46:12 2014 +0400 configure: improve cc_check for clang and ICL to not ignore unknown options commit ecb04d08af654a7cfd5b9aa6261bd789de20613a [revision 2474] Author: Henrik Gramner Date: Tue Aug 5 01:42:44 2014 +0200 checkasm: Only call x264_cpu_detect() once commit 1343db872b1d7d43dc7fb431a8207efb5ca31e2e [revision 2473] Author: Janne Grunau Date: Fri Jul 18 14:49:10 2014 +0100 aarch64: deblocking NEON asm Deblock chroma/luma are based on libav's h264 aarch64 NEON deblocking filter which was ported by me from the existing ARM NEON asm. No additional persons to ask for a relicense. commit 3c1fa5d9b2ea62f05473080313c543b7e795b307 [revision 2472] Author: Janne Grunau Date: Fri Jul 18 09:29:35 2014 +0100 aarch64: intra predition NEON asm Ported from the ARM NEON asm. commit 556b0e7928d14818454e0c33032754f6323f02e9 [revision 2471] Author: Janne Grunau Date: Thu Jul 17 15:58:44 2014 +0100 aarch64: motion compensation NEON asm Ported from the ARM NEON asm. commit 6cda439867fcd9e884a10502845fb79fc7ffed69 [revision 2470] Author: Janne Grunau Date: Wed Jul 16 10:03:52 2014 +0100 aarch64: transform and zigzag NEON asm Ported from the ARM NEON asm. commit db5c504aa06550f8e916157d1dcc657818e84d62 [revision 2469] Author: Janne Grunau Date: Tue Jul 15 12:57:03 2014 +0100 aarch64: quantization and level-run NEON asm Ported from the ARM NEON asm. commit f4a82a54885f3dad7106a6855eaef50ea085b27e [revision 2468] Author: Janne Grunau Date: Wed Mar 19 13:48:21 2014 +0100 aarch64: pixel metrics NEON asm Ported from the ARM NEON asm. commit 3e57554ee4db6ade7a2dccaac92cb8116f3a43d6 [revision 2467] Author: Janne Grunau Date: Fri Jul 18 17:44:57 2014 +0200 aarch64: add utility functions for asm commit efaf0b88f7c703533ee8857a6a5039cf64bce3a0 [revision 2466] Author: Janne Grunau Date: Wed Mar 19 13:45:17 2014 +0100 aarch64: add armv8 and neon cpu flags and test them commit 943128a527d1b98a63017d58cd1fcf53aaffcb6e [revision 2465] Author: Janne Grunau Date: Tue Mar 18 22:10:24 2014 +0100 aarch64: initial build support commit ee427b69868d506182f4e22bffdc45e913f255af [revision 2464] Author: Janne Grunau Date: Tue Jul 22 19:28:27 2014 +0200 checkasm: test zigzag_sub_8x8_{frame,field} commit 69740fd362ee1c0a2e80d6f4e2724d731a3c951c [revision 2463] Author: Janne Grunau Date: Sun Jul 20 18:29:01 2014 +0200 arm: use long multiplication in mc_weight_w*_neon 9-19% faster on a cortex-a9. commit 0a05b3f9aa8c524a67119ec5eb6bcc24eb8f2f3b [revision 2462] Author: Janne Grunau Date: Sun Jul 20 18:24:57 2014 +0200 arm: do not use aligned stores in mc_weight_w4_*neon mc_weight_w4_*neon is also used for width 2 which does not guarantee 4-byte aligned destination. Fixes crashes caused by random memory corruption. commit c2df1fc65c98e213c444134d5dbbb79d439af4db [revision 2461] Author: Janne Grunau Date: Wed Apr 2 16:31:28 2014 +0200 checkasm: add memory clobber to read_time inline asm The memory acts as compiler barrier preventing aggressive reordering of read_time calls. gcc 4.8 reorders some of initial read_time calls after the second when targeting arm. commit d72760401cb0602b8bf86037988e66cdc810681c [revision 2460] Author: Janne Grunau Date: Sun Jul 20 13:32:10 2014 +0200 arm: check if the assembler supports the '.func' directive The integrated assembler in llvm trunk (to be released as 3.5) is otherwise capable enough to assemble the arm asm correctly. commit 9463ec0004f1bddc49c05ed8e38430a4ce1738fb [revision 2459] Author: Janne Grunau Date: Sun Jul 20 13:40:28 2014 +0200 arm/ppc: use $CC as default assembler commit feec4a478bfdfb4426268b2ee79bac473b97488c [revision 2458] Author: Janne Grunau Date: Sun Jul 20 13:34:27 2014 +0200 arm: move instructions after '.rept' to separate line The gas manual states "Repeat the sequence of lines between the .rept directive and the next .endr directive ...". GNU as seems to support instructions on the same line as .rept anyway but the integrated assembler in llvm trunk (to be released 3.5 in August 2014) does not. commit 6e8971021d2a12505cb2ad9ea677dfc8af676919 [revision 2457] Author: Janne Grunau Date: Sun Jul 20 13:08:17 2014 +0200 arm: set .arch/.fpu from asm.S commit 716ee56d0b35e512e8e0ae1a3e71f26e65e86be3 [revision 2456] Author: Janne Grunau Date: Sun Jul 20 12:55:53 2014 +0200 arm: do not append CFLAGS to ASFLAGS commit 021c0dc6c95c1bc239c9db78a80dd85fc856a4dd [revision 2455] Author: Tristan Matthews Date: Thu Jul 17 00:03:50 2014 -0400 filters: fix sizeof mismatch commit 95beb822e61a8d84dba9743f4b20b4c303f26798 [revision 2454] Author: Anton Mitrofanov Date: Thu Jul 31 16:17:32 2014 +0400 Fix memory leak when using select_every filter commit ea0ca51e94323318b95bd8b27b7f9438cdcf4d9e [revision 2453] Author: Tsukasa OMOTO Date: Sun Jul 20 22:17:11 2014 +0900 Fix cltostr.sh on OS X commit 08d36b3fc975d049aa3786ca34fb0b2f2ba0007c [revision 2452] Author: Fiona Glaser Date: Wed Jul 9 12:21:33 2014 -0700 Check pf_log is set in validate_parameters Help remind people to call x264_param_default in case they didn't read the documentation. commit 9e93d18b7fe7668f8277b5f117d7e39be24c6070 [revision 2451] Author: Anton Mitrofanov Date: Wed Jul 9 17:17:04 2014 +0400 Check malloc during frame dumping commit 8a85db879d57537f91a9908be3585512981c08b8 [revision 2450] Author: Yusuke Nakamura Date: Thu Jun 19 05:21:29 2014 +0900 mp4_lsmash: Use new I/O API instead of deprecated one. commit f112c0e1cae71eb5b98b4f86f635f235cc7b81cb [revision 2449] Author: Anton Mitrofanov Date: Sun Jun 8 22:19:46 2014 +0400 Remove meaningless use of abs() commit 6fbbb5b0c05a1d95cbd6efa7f01808ea87a39dc9 [revision 2448] Author: Steven Walters Date: Sat May 31 10:31:16 2014 -0400 MSVS 2013 Update 2 support The first MSVS compiler C99 compliant enough to build x264. Use `CC=cl ./configure` to compile with it. commit f53af048ed94551734265cf8f9dbe12d211a77fc [revision 2447] Author: Diego Biurrun Date: Tue Apr 15 22:54:08 2014 +0200 configure: Add -Wno-maybe-uninitialized to CFLAGS The warnings generated by -Wmaybe-uninitialized are mostly spurious. commit cbd8d7b6db1f29929d1ad347e15afe7828ad7055 [revision 2446] Author: Diego Biurrun Date: Wed May 7 13:20:43 2014 +0200 build: Replace cltostr.pl by a shell script This avoids a dependency on Perl to build OpenCL support. commit d8b6ce7f703d3c9d83dbd4e8ef44cfabc7e2f78e [revision 2445] Author: Diego Biurrun Date: Tue Apr 15 23:02:39 2014 +0200 build: Simplify phony target declaration with wildcards Also add etags to list of phony targets. commit 2bd932fdf053faace84028a66d8ba9e17d526456 [revision 2444] Author: Diego Biurrun Date: Wed May 7 12:47:37 2014 +0200 configure: Drop workaround for obsolete gcc 4.2 on ARM commit 31311f254971e1da51d817cb580fc4fe1f4d5f20 [revision 2443] Author: Diego Biurrun Date: Wed May 7 21:43:15 2014 +0200 build: Add dependencies on x86inc.asm/x86util.asm for all .asm files This is a little bit overzealous, but errs on the side of caution. Generating full dependency information is also possible, but slightly slows down the build as YASM cannot do it as a sideeffect of compilation. commit 016831ec7b3a4a7062908243dbde62d7d89b334e [revision 2442] Author: Diego Biurrun Date: Sun Apr 27 21:09:54 2014 +0200 Delete all SPARC optimizations SPARC has been obsolete for a long time and makes little sense as a H.264 encoding platform. Also update authors file. commit c7c8eb15923d1888bb87e7642a66b417fab61e76 [revision 2441] Author: Diego Biurrun Date: Wed May 7 12:46:42 2014 +0200 configure: Don't check for libavcore libavcore was a never-released bad idea with a short lifespan. commit dd5b5d3959e35c122c7709a9823a26b589c950da [revision 2440] Author: Diego Biurrun Date: Sun Apr 27 23:19:04 2014 +0200 build: Set all ASFLAGS from within configure This is how all other toolchain flags are handled. commit c15f20bd772487d863f01a2813a3ab45b1f11a6b [revision 2439] Author: Diego Biurrun Date: Sun Apr 27 23:23:49 2014 +0200 opencl: Check return value of fread() common/opencl.c:138:10: warning: ignoring return value of 'fread', declared with attribute warn_unused_result [-Wunused-result] commit af8e768e2bd3b4398bca033998f83b0eb8874914 [revision 2438] Author: Fiona Glaser Date: Sat Jul 19 20:34:22 2014 -0700 Disable i8x8 in lossless x264's implementation was slightly incorrect due to a vague spec, so some decoders decoded video incorrectly. Minimal impact on compression. commit 450cf7ae2592ee0cb474bcefedf90c9911605e26 [revision 2437] Author: Thomas Mundt Date: Fri Jun 27 11:12:06 2014 -0700 AVC-Intra: fix compatibility with Avid Transfermanager commit 6eb483e4ca23f34a6a8fe09f3f2e9c9f192fd76b [revision 2436] Author: Henrik Gramner Date: Tue Jul 8 21:15:32 2014 +0200 x86: Fix SIGILL in high bit-depth intra_sad_x3_4x4_sse2 An SSE3 instruction was used in an SSE2 function. commit 5e58ce7a8b39ab66c7d6420b85a8e09dd08dfaaf [revision 2435] Author: Anton Mitrofanov Date: Wed Jul 9 17:01:54 2014 +0400 Fix incorrect row predictor addressing Somehow managed to not cause things to explode, but was clearly incorrect. Might improve VBV in some cases to have this working right. commit 3fda920e6f1e4a8f76680c001962542866408114 [revision 2434] Author: Anton Mitrofanov Date: Sat Jun 21 23:52:39 2014 +0400 Fix b-pyramid MMCO remove for frame-packing==5 commit 92fdb43dd47fbc3368d9d9c7ad940fbe03657bd3 [revision 2433] Author: Tal Aloni Date: Tue Jun 17 15:10:56 2014 -0700 Fix frame-packing==5 with some decoders The spec mandates that frame-packing==5 requires the SEI on every frame that begins a view sequence (i.e. the input frames L0-R0-L1-R1 have 4 view sequences, but if reordered by the encoder to L0-L1-R0-R1 there are now 2 view sequences). For simplicity, we write the SEI on every frame. This fixes frame-packing==5 3D playback on some decoders (PlayStation 3, Sony W8 series, possibly others). commit 13d6dfd83af98e472a9e9a8b6abf5c971707a893 [revision 2432] Author: Anton Mitrofanov Date: Thu May 22 13:27:00 2014 +0400 Fix pixel_ssim_end4 asm function for x86_64 systems commit a5831aa256b3161f898d2577d2eb8daa838d88d2 [revision 2431] Author: James Almer Date: Wed Apr 9 03:33:06 2014 -0300 x86: XOP pixel_sad_{x3, x4} high bit-depth commit 0d989a4ff3298f9e495be452880b5f9bfb441e93 [revision 2430] Author: James Almer Date: Wed Apr 9 03:33:05 2014 -0300 x86: XOP pixel_ssd_nv12_core commit 9b77dffab04e3ea242598454282b40800e720353 [revision 2429] Author: James Almer Date: Wed Apr 9 03:33:04 2014 -0300 x86util: XOP optimized HADDD commit 1e517399f76b12fe2e73892970fe3aac01a178f8 [revision 2428] Author: James Almer Date: Wed Apr 9 03:33:03 2014 -0300 x86: add missing initialization for high bit-depth sa8d_satd commit aa00925abd6f9ab4e20216ae5a5ad79b67756162 [revision 2427] Author: James Almer Date: Sat Apr 5 23:46:31 2014 -0300 x86: add missing initializations for high bit-depth variance commit fadc4045f91ca78c046f301cba6065732b5d27ea [revision 2426] Author: Janne Grunau Date: Tue Apr 1 22:11:45 2014 +0200 arm: use the weight_fn_t typedef for mc weight function arrays commit 644c396be97c1e6ace144f8be04afab19fb238af [revision 2425] Author: Janne Grunau Date: Tue Apr 1 22:11:44 2014 +0200 arm: correct x264_mc_chroma_neon function declaration commit b2e9ca30f1e9ac25df1f592db04ff0d91faf42d4 [revision 2424] Author: Janne Grunau Date: Tue Apr 1 22:11:43 2014 +0200 arm: do not export every asm function Based on Libav's libavutil/arm/asm.S. Also prevents having the same label twice for every function on systems not defining EXTERN_ASM. Clang's integrated assembler does not like it. commit ceb1484da34b7492f539b535a930652690372fe5 [revision 2423] Author: Janne Grunau Date: Tue Apr 1 22:11:42 2014 +0200 arm: move all .macro/.endm to column 0 commit 24ab0e75db887c2b1a412d00878810ed6501061e [revision 2422] Author: William Grant Date: Sun Mar 23 09:21:52 2014 -0700 aarch64: require PIC in shared mode commit 435722c9c1870cd54fdb89be39250d492aecb598 [revision 2421] Author: Janne Grunau Date: Sun Mar 16 17:21:58 2014 +0100 arm: x264_coeff_last8_arm checkasm --bench on a coretex-a9: coeff_last8_c: 173 coeff_last8_armv6: 151 60 instead of 73 cycles in ~130k runs on the same cpu while encoding. commit 2e96c571b8c324304b3d4fbb7914143518349213 [revision 2420] Author: Janne Grunau Date: Sat Mar 15 20:09:18 2014 +0100 arm: x264_store_interleave_chroma_neon store_interleave_chroma_c: 4036 store_interleave_chroma_neon: 1043 commit 1576e51e52148ad1e1d8b5e76562f9eae8d47e6e [revision 2419] Author: Janne Grunau Date: Sat Mar 15 19:55:50 2014 +0100 arm: x264_plane_copy_interleave_neon plane_copy_interleave_c: 40285 plane_copy_interleave_neon: 10137 commit 0016dec27080e53c794d7f919bd6df6b890d0128 [revision 2418] Author: Janne Grunau Date: Sat Mar 15 19:21:12 2014 +0100 arm: x264_plane_copy_deinterleave_rgb_neon plane_copy_deinterleave_rgb_c: 31543 plane_copy_deinterleave_rgb_neon: 8312 commit 5e0ca9aa4eab5e2cb4b124774c3ecebbc6f1ae35 [revision 2417] Author: Janne Grunau Date: Sat Mar 15 18:22:49 2014 +0100 arm: load_deinterleave_chroma_f{dec,enc}_neon load_deinterleave_chroma_fdec_c: 4055 load_deinterleave_chroma_fdec_neon: 995 load_deinterleave_chroma_fenc_c: 4071 load_deinterleave_chroma_fenc_neon: 992 commit c9a5ae0d219b6a28adebdb83faf89f291611f57b [revision 2416] Author: Janne Grunau Date: Sat Mar 15 17:22:08 2014 +0100 arm: x264_plane_copy_deinterleave_neon plane_copy_deinterleave_c: 42988 plane_copy_deinterleave_neon: 10184 commit c570be3ea9f24942c362e1c2402ec7fccbb5c330 [revision 2415] Author: Janne Grunau Date: Sat Mar 15 13:29:41 2014 +0100 arm: implement deblock_strength_neon Based on deblock_strength_avx. checkasm --bench on a cortex-a9: deblock_strength_c: 14611 deblock_strength_neon: 1848 commit 2794ba5bb0007e0edf32d5325ca82cbf654f79b0 [revision 2414] Author: Janne Grunau Date: Sat Mar 15 10:51:11 2014 +0100 arm: add missing macro instantiation for x264_pixel_avg_4x16_neon checkasm --bench on a cortex-a9: avg_4x16_c: 8910 avg_4x16_neon: 2091 commit d6002ebace8194d17ee0ba607ff82c4f9075dd2d [revision 2413] Author: Janne Grunau Date: Thu Mar 13 01:02:13 2014 +0100 arm: implement x264_predict_4x4_v_armv6 Alone probably not worth it but allows use of predict_4x4_dc|h_armv6 in intra_sad|satd_x3_4x4_neon. commit d7e689680023e327de7e052e01e7faee30135799 [revision 2412] Author: Roland Stigge Date: Sun Mar 23 09:29:37 2014 -0700 ppc: fix build on certain PowerPC variants without Altivec commit 863ea2a224cf7380c7a6ea9ae531e16b621cc0b7 [revision 2411] Author: Anton Mitrofanov Date: Tue Apr 22 00:58:24 2014 +0400 Only add strip option '-s' for linker flags Fixes some build warnings with clang. commit 4102614df9a11d66b506fb435132ddd0f88c6f94 [revision 2410] Author: Tsukasa OMOTO Date: Sat Mar 15 16:53:53 2014 +0900 configure: remove an unnecessary option from CFLAGS on OS X Fixes Clang 3.4 compilation on OS X. commit b3fb718404d6cce9c82987ea2909cda5072d040c [revision 2409] Author: Fiona Glaser Date: Sun Feb 23 10:36:55 2014 -0800 Macroblock tree overhaul/optimization Move the second core part of macroblock tree into an assembly function; SIMD-optimize roughly half of it (for x86). Roughly ~25-65% faster mbtree, depending on content. Slightly change how mbtree handles the tradeoff between range and precision for propagation. Overall a slight (but mostly negligible) effect on SSIM and ~2% faster. commit 00a00ccab316de3d50da6a82ba4af44dcb4655ec [revision 2408] Author: Janne Grunau Date: Thu Mar 13 00:05:48 2014 +0100 arm: use available neon functions for intra_sa8d/sad/satd_x3 4% faster on main/medium, 15% faster on baseline/superfast on a cortex-a9. commit ac8f2e8a4cf21b2026957509bea8865ff7879fb4 [revision 2407] Author: Janne Grunau Date: Wed Mar 12 14:35:31 2014 +0100 arm: implement x264_pixel_var2_8x16_neon checkasm --bench on a cortex-a9: var2_8x16_c: 5677 var2_8x16_neon: 1421 commit 66836125beabdaff561da89ea1e18e566f5d202a [revision 2406] Author: Janne Grunau Date: Wed Mar 12 13:16:00 2014 +0100 arm: implement x264_pixel_var_8x16_neon checkasm --bench on a cortex-a9: var_8x16_c: 4306 var_8x16_neon: 791 commit a90ea34cf264d6b7733c5ffbe6d46882c306b50f [revision 2405] Author: Henrik Gramner Date: Sun Feb 23 15:33:48 2014 +0100 x86: SSE2 and SSSE3 plane_copy_deinterleave_rgb About 5.6x faster than C on Haswell. commit f032147ca69401165495a36cf7aba5b8c95ecb3b [revision 2404] Author: Henrik Gramner Date: Sun Feb 16 21:24:54 2014 +0100 x86: Minor mbtree_propagate_cost improvements Reduce the number of registers used from 7 to 6. Reduce the number of vector registers used by the AVX2 implementation from 8 to 7. Multiply fps_factor by 1/256 once per frame instead of once per macroblock row. Use mova instead of movu for dst since it's guaranteed to be aligned. Some cosmetics. commit 7c860f075ccd14fb7891d5fc6c9eab1a37ea555d [revision 2403] Author: Henrik Gramner Date: Sun Feb 9 23:58:04 2014 +0100 x86inc: Support arbitrary stack alignments If the stack is known to be at least 32-byte aligned we can safely store ymm registers on the stack without doing manual alignment. Change ALLOC_STACK to always align the stack before allocating stack space for consistency. Previously alignment would occur either before or after allocating stack space depending on whether manual alignment was required or not. commit 039fab9203179f9e790abfd54ae5b2254ef803e7 [revision 2402] Author: Anton Mitrofanov Date: Fri Feb 14 15:53:58 2014 +0400 x86inc: warn if XOP integer FMA instruction emulation is impossible Emulation requires a temporary register if arguments 1 and 4 are the same; this doesn't obey the semantics of the original instruction, so we can't emulate that in x86inc. ffmpeg has an x86util emulation for that case; I'll add it if x264's asm ever needs it. Also add pmacsdql emulation. commit 974f2e78e0cb25e06fedbcfef70f80938f22988b [revision 2401] Author: Loren Merritt Date: Sat Mar 1 02:57:56 2014 +0000 x86inc: free up variable name "n" in global namespace commit 8596dd36df38d33d402e848035b1bd31edc2c389 [revision 2400] Author: Henrik Gramner Date: Wed Jan 22 19:09:12 2014 +0100 x86: Pass -Worphan-labels to yasm Makes it easier to detect typos. commit 0bb3b2edb866dd852bb1f5faed88df4bdcf0c16f [revision 2399] Author: Steve Lhomme Date: Sun Feb 16 13:15:09 2014 +0100 Write 3D metadata when outputting Matroska For when --frame-packing is set. commit f35e3fc26b99e1b3c943c131100fdfa4733fc932 [revision 2398] Author: Anton Mitrofanov Date: Sun Feb 23 16:56:03 2014 +0400 Don't set chroma_loc_info_present_flag for non-4:2:0 The H.264 spec says it shouldn't be set in these cases. commit b7a50c16414631c8ff5e417da51b190c8999027e [revision 2397] Author: Fiona Glaser Date: Mon Mar 10 08:42:50 2014 -0700 x264.h: fix documentation The full details of the return values of encoder_encode and encoder_headers were mistakenly removed a while ago; re-add them. commit de01d8821b59b85a01c8a89e544e0fed6488b958 [revision 2396] Author: Anton Mitrofanov Date: Sun Feb 23 15:52:57 2014 +0400 Fix pointer cast warning for 64-bit builds commit 8b821ec19ba9425c120b8986a57ca7c6b9f088ed [revision 2395] Author: Anton Mitrofanov Date: Mon Mar 10 16:48:02 2014 +0400 mbaff: fix mb_field_decoding_flag tracking and simplify allow skip check Fixes an issue with too many forced non-skips in mbaff+cavlc, as well as non-deterministic output with mbaff+cavlc+sliced-threads. commit 850c8c5d6139df82e969d2174eebba69b479aa16 [revision 2394] Author: Anton Mitrofanov Date: Mon Mar 10 03:22:57 2014 +0400 Fix memory overwrite in x264_deblock_h_chroma_mbaff_sse2 Fixes possible corruption with MBAFF+sliced threads. commit 19dddbcff73541ae15f8e57383ff1c6aa907d99d [revision 2393] Author: Fiona Glaser Date: Sun Mar 2 10:09:01 2014 -0800 Fix corruption with CAVLC overflow handling in MBAFF+main profile Probably a regression in r2178. commit 48dbfa28201950f7e07e96a7d62b2951dd2dbe03 [revision 2392] Author: Anton Mitrofanov Date: Mon Mar 10 21:17:19 2014 +0400 Fix checkasm --bench output when nop_cycles is too large commit ee8d5e4b51da99e576b5aea3008e70d1c7ed2372 [revision 2391] Author: Anton Mitrofanov Date: Wed Jan 22 12:54:49 2014 +0400 Really fix quantization factor allocation Actually allocate less (instead of just initialize less) and fix comments. commit 0d668be8d7525992c1c163c97551ee897e43c177 [revision 2390] Author: Yu Xiaolei Date: Sun Feb 23 04:12:51 2014 -0800 Fix build with Android NDK Android NDK does not expose sched_getaffinity. commit 42d25196d423626c12794db3f66322c7a3f4375e [revision 2389] Author: Loren Merritt Date: Thu Jan 16 13:34:46 2014 -0800 x86inc: speed up compilation with yasm Work around yasm's inefficiency with handling large numbers of variables in the global scope. commit dd6a303498d1f55c73037ed925a6ece8e28a95bc [revision 2388] Author: Kieran Kunhya Date: Fri Jan 10 23:27:33 2014 +0000 Add support for AVC-Intra Class 200 commit 41227fa2531d9263e481b80237d2d9ef6f5a450f [revision 2387] Author: James Weaver Date: Tue Jan 7 10:31:58 2014 +0000 v210 input support Assembly based on code by Henrik Gramner and Loren Merritt. commit e2a9662751180b7dd2fe538913282ee800445445 [revision 2386] Author: Fiona Glaser Date: Tue Jan 21 13:39:33 2014 -0800 Fix quantization factor allocation We don't need to wastefully allocate quant tables above QP_MAX_SPEC; they're never used. commit 8be6600d10a74ca241dbb27e096883ceed7b4082 [revision 2385] Author: Henrik Gramner Date: Wed Jan 8 01:06:56 2014 +0100 Avoid some unneccesary memory loads in macroblock_encode commit 807aeaaae7351e4c2c536463e69dacaac218bccb [revision 2384] Author: Henrik Gramner Date: Sun Jan 5 15:25:05 2014 +0100 Bump dates to 2014 Also update AUTHORS file and my e-mail address in the headers of various files. commit 02697d57d987f8d51a5c3ced5e5b81d7137012ee [revision 2383] Author: Henrik Gramner Date: Mon Jan 6 00:18:31 2014 +0100 Remove tools/xyuv.c It's an old stand-alone application that isn't relevant to x264. commit 7664014b2b490d81a66f2a13138182dfaaf4be06 [revision 2382] Author: Anton Mitrofanov Date: Thu Nov 7 02:37:23 2013 +0400 Use 8x16c wrappers with x86 asm functions for 4:2:2 with high bit depth commit 6bc63417e10e135d8cd881495c71be72d322e1d3 [revision 2381] Author: Henrik Gramner Date: Fri Dec 20 22:44:28 2013 +0100 CLI: Avoid redundant 16-bit upconversions in piped raw input It's not possible to seek in pipes, so if we want to skip frames we have to read and discard unused ones. It's pointless to do bit-depth upconversions in those frames. commit 008c56ec467736bc5d3130ff890c618d28aa7511 [revision 2380] Author: Anton Mitrofanov Date: Fri Jan 3 20:06:06 2014 +0400 Fix input support from named pipes in Windows commit 91481419e3acc4bb601600cf32e46e7f93ae02ab [revision 2379] Author: Steve Clark Date: Wed Nov 20 21:40:23 2013 +0400 Fix ARM asm compilation with Apple assembler commit a2f5d600bf866899db92e2dae40eb9fe46d44ade [revision 2378] Author: Anton Mitrofanov Date: Wed Nov 13 19:24:48 2013 +0400 Fix uninitialized variable Caused if the timebase is not specified in stats file. Found by Clang. commit 95d196ef2edde109cfb32f4baa9b0adc67e842e1 [revision 2377] Author: Anton Mitrofanov Date: Sun Oct 27 19:27:23 2013 +0400 Remove --visualize option. It probably wasn't used or maintained for last few years. commit 09c7010e3d13e66a241c0529b36ae3f7e1664ff4 [revision 2376] Author: Anton Mitrofanov Date: Tue Oct 15 12:32:25 2013 +0400 Add L-SMASH support as preferable alternative for MP4-muxing commit c9f2bceb1f37aeaf6b7ed730f0fd210ef8725cab [revision 2375] Author: Kieran Kunhya Date: Sat Sep 21 19:16:12 2013 +0100 Add AVC-Intra 1080p50/60 Class 100 parameters Also add some compatibility fixes. commit c084f6c029f016cf2024a2fc511825e82fb95865 [revision 2374] Author: Fiona Glaser Date: Mon Sep 9 12:37:59 2013 -0700 Add --filler option Allows generation of hard-CBR streams without using NAL HRD. Useful if you want to be able to reconfigure the bitrate (which you can't do with NAL HRD on). commit 350b214c5abe7e82618ac46a14f23b7ab543045e [revision 2373] Author: Anton Mitrofanov Date: Sun Oct 27 15:22:51 2013 +0400 Make x264_encoder_reconfig more threadsafe Do the reconfig when the next frame's encode begins. Fixes some rare crashes with frame-threading and encoder_reconfig. commit 77cc44feea75106fae6d3113f6babbbe8cffba87 [revision 2372] Author: Fiona Glaser Date: Thu Oct 24 17:19:00 2013 -0700 chroma-me: take shortcut in BI analysis ~100 cycles faster with subme>=9 commit 7634f8c6047e9e12036778a8dc8d4cd4b06eebcb [revision 2371] Author: Fiona Glaser Date: Thu Oct 24 14:44:43 2013 -0700 CRF-max: don't warn if VBV underflow occurs Only warn if underflow occurs for reasons other than CRF-max, as CRF-max implies that VBV underflow is desired by the user. commit 4b68633dc375fc372f160a3ae669a32e519b285a [revision 2370] Author: Henrik Gramner Date: Fri Oct 18 22:43:36 2013 +0200 x86inc: Make ym# behave the same way as xm# This makes more sense for future implementations of templates with zmm registers. commit b54422a858809f39c00fac46207bfa8ad16cdb28 [revision 2369] Author: Henrik Gramner Date: Fri Oct 18 22:21:38 2013 +0200 Use calloc instead of malloc + memset commit 8b58a4ce52047b00f5892a9cdd92f9695a50a933 [revision 2368] Author: Henrik Gramner Date: Thu Oct 10 16:54:12 2013 +0200 Replace gf_malloc with regular malloc in mp4 muxer It was used as a workaround for a bug that only existed in the GPAC repository for a few weeks back in 2010. There's no reason to keep it anymore. commit 05f04384a10cb673abea7749cd319971c0017769 [revision 2367] Author: Anton Mitrofanov Date: Tue Oct 8 23:20:40 2013 +0400 Update to current libav/ffmpeg API commit b7b6029f0e121b87fd96595b15d0c40fcd1b3bf1 [revision 2366] Author: Rafaël Carré Date: Fri Oct 25 07:12:24 2013 -0700 version.sh: change to use /bin/sh commit c3c73f13bb9ee60ccf40f85dbc11c91efac9d1e2 [revision 2365] Author: Sean McGovern Date: Wed Sep 4 14:15:00 2013 -0700 configure: don't generate a git version number if .git isn't present commit 12f9d499905199427a0196743c2cde56642d6d99 [revision 2364] Author: Martin Storsjo Date: Tue Sep 3 14:56:18 2013 -0700 configure: include dependency libs in the Libs pkg-config If only a static library is built, the user of the library that just tries to link to the lib using the flags provided by pkg-config might not know that only a static lib exists and that he'd have to pass --static to pkg-config to get the internal dependencies to be able to link the library. For a shared build, the internal dependencies are kept in Libs.private as before. This matches how libav's pkg-config files are generated. commit 03450be799dea03a83dad4dc833ef8ddd7f36b62 [revision 2363] Author: Anton Mitrofanov Date: Fri Oct 18 00:38:06 2013 +0400 Fix compilation in case of HAVE_LOG2F check fails spuriously commit 266fdfcd4809afb018e45ab959d4a56a42712c88 [revision 2362] Author: Anton Mitrofanov Date: Sat Oct 12 12:01:57 2013 +0400 Fix compilation of shared library for Windows with original MinGW toolchain commit 50a0c33b9b5fa57d0a129b7441a6af55f7a08005 [revision 2361] Author: Anton Mitrofanov Date: Tue Oct 8 23:32:37 2013 +0400 Fix possible crashes in resize and crop filters with high bitdepth input commit 5b272b22d8f7511a4abece5a23ad25282bedaea8 [revision 2360] Author: Tim Mooney Date: Tue Sep 3 13:43:50 2013 -0700 Fix INSTALL in configure for Solaris systems commit 2fd292391a4d41b9fc65ee652b4663fdd9f8107e [revision 2359] Author: Henrik Gramner Date: Wed Aug 28 00:50:31 2013 +0200 Workaround for FFMS indexing bug If FFMS_ReadIndex is used with an empty index file it gets stuck in an infinite loop instead of returning NULL like it's supposed to do on failure. Explicitly check if the file is empty before calling it as a workaround. commit 5bcff2a62c050376ca54c5e5040d0529c89eb9f2 [revision 2358] Author: Anton Mitrofanov Date: Mon Aug 26 21:20:31 2013 +0400 Fix masked access violation in KERNEL32 Caused crashes under gdb in Windows and might cause other unknown problems. commit 098b686e6397d5bb6b3a5c03cd918aa88216909f [revision 2357] Author: Hiroki Taniura Date: Sun Aug 25 01:18:57 2013 +0900 Fix GPAC support on Windows commit fa3cac516cb71b8ece09cedbfd0ce631ca8a2a4c [revision 2356] Author: Henrik Gramner Date: Sun Aug 11 19:50:42 2013 +0200 Windows Unicode support Windows, unlike most other operating systems, uses UTF-16 for Unicode strings while x264 is designed for UTF-8. This patch does the following in order to handle things like Unicode filenames: * Keep strings internally as UTF-8. * Retrieve the CLI command line as UTF-16 and convert it to UTF-8. * Always use Unicode versions of Windows API functions and convert strings to UTF-16 when calling them. * Attempt to use legacy 8.3 short filenames for external libraries without Unicode support. commit 9b94896b3735052cabb52d081de3b50020a077cb [revision 2355] Author: Kieran Kunhya Date: Sat Jul 20 18:47:59 2013 +0100 AVC-Intra support This format has been reverse engineered and x264's output has almost exactly the same bitstream as Panasonic cameras and encoders produce. It therefore does not comply with SMPTE RP2027 since Panasonic themselves do not comply with their own specification. It has been tested in Avid, Premiere, Edius and Quantel. Parts of this patch were written by Fiona Glaser and some reverse engineering was done by Joseph Artsimovich. commit fa1e2b746d95575b5c5b8e49fcfcad3ded9a5420 [revision 2354] Author: Henrik Gramner Date: Mon Jul 8 12:06:42 2013 -0700 Transparent hugepage support Combine frame and mb data mallocs into a single large malloc. Additionally, on Linux systems with hugepage support, ask for hugepages on large mallocs. This gives a small performance improvement (~0.2-0.9%) on systems without hugepage support, as well as a small memory footprint reduction. On recent Linux kernels with hugepage support enabled (set to madvise or always), it improves performance up to 4% at the cost of about 7-12% more memory usage on typical settings.. It may help even more on Haswell and other recent CPUs with improved 2MB page support in hardware. commit e33aac9aba5c6b9c867b92f14c7722152680a61a [revision 2353] Author: Henrik Gramner Date: Fri Jul 5 21:15:54 2013 +0200 x86: SSSE3 implementation of pixel_sad_x3 and pixel_sad_x4 commit 4becc3e9e031c4207698846369aac2bef1480d15 [revision 2352] Author: Henrik Gramner Date: Fri Jul 5 21:15:49 2013 +0200 x86: Faster AVX2 pixel_sad_x3 and pixel_sad_x4 commit 401edc3ab08f95777d495b38030e2108d7d3f0b4 [revision 2351] Author: Diogo Franco Date: Tue Jul 23 22:17:44 2013 -0300 configure: Support cygwin64 commit adc99d17d8c1fbc164fae8319b40d7c45f30314e [revision 2350] Author: Derek Buitenhuis Date: Fri Aug 9 13:39:27 2013 -0400 x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64" This is also a valid value for WIN64. commit 1430b04988c3bb344e104c974ed3aa825035c0ec [revision 2349] Author: Anton Mitrofanov Date: Tue Jul 23 14:11:50 2013 -0700 Fix cases in which intra refresh allowed prediction from disallowed pixels commit a6c396f0fe01f453de115ba0d8c4aa26138aa6b4 [revision 2348] Author: Anton Mitrofanov Date: Wed Aug 7 01:56:34 2013 +0400 Fix a few minor bugs found with a static analyzer commit 2d66c7c2471801aa946517226739e9150f6c1948 [revision 2347] Author: Fiona Glaser Date: Fri Jul 12 16:07:35 2013 -0700 Fix AVX2 detection bug with "limit CPUID" enabled in BIOS commit ff41804efd4caec120fc9e1b90ad226035f75aaa [revision 2346] Author: Henrik Gramner Date: Fri Jul 5 21:15:43 2013 +0200 x86: Remove X264_CPU_SSE_MISALIGN functions Prevents a crash if the misaligned exception mask bit is cleared for some reason. Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is miniscule. They also require modifying the MXCSR control register and by removing those functions we can get rid of that complexity altogether. VEX-encoded instructions also supports unaligned memory operands. I tried adding AVX implementations of all removed functions but there were no performance improvements on Ivy Bridge. pixel_sad_x3 and pixel_sad_x4 had significant code size reductions though so I kept them and added some minor cosmetics fixes and tweaks. commit 01087fdbf2042095cb36458fd5c5efab3f4b3a37 [revision 2345] Author: Fiona Glaser Date: Thu Jun 20 15:51:39 2013 -0700 Tweak i16x16-delta-quant-avoidance code Don't omit the delta quant if it'd raise the quantizer to do so; this fixes a rare flickering issue caused by deblocking. commit bfa2f0c44cec2e41fbd7566edb55e405f6c5a49d [revision 2344] Author: Fiona Glaser Date: Sun Jun 9 09:06:27 2013 -0700 x86: faster AVX2 iDCT, AVX deblock_luma_h, deblock_luma_h_intra commit 397f60e7f23e2c6ec2cb9b168ebb75cc42983dd7 [revision 2343] Author: Lucien Date: Mon Jun 17 18:28:09 2013 +0000 Add new color primaries, transfer characteristics, matrix coefficients commit fa215fc9d77d131595e8b1eda0fc4e9da62c1f94 [revision 2342] Author: Fiona Glaser Date: Fri May 31 17:01:29 2013 -0700 Add "--stitchable" option for segmented encoding Stops x264 from attempting to optimize global stream headers, ensuring that different segments of a video will have identical headers when used with identical encoding settings. commit 9143d5ad966a3864597009ba1f1befe87328ec61 [revision 2341] Author: Fiona Glaser Date: Thu Jun 27 08:29:06 2013 -0700 Interface: if vbv-maxrate < bitrate, set bitrate = vbv-maxrate This probably makes more sense to the user than setting vbv-maxrate = bitrate, as before. commit 83d35c7bc4332e4dd27ba7b8baf96f8743c52a8b [revision 2340] Author: Anton Mitrofanov Date: Tue May 28 05:02:42 2013 -0700 OpenCL cosmetics commit ffc3ad4945da69f3caa2b40e4eed715a9a8d9526 [revision 2339] Author: Anton Mitrofanov Date: Tue Jun 18 00:16:33 2013 +0400 Fix possible crash when writing very large filler NALUs Bitstream-reallocation function didn't handle the case of filler. commit 25ef3f5fdbfca0f9a5ff8a97b8475e7f8b4c9202 [revision 2338] Author: Loren Merritt Date: Mon Jun 17 11:27:09 2013 -0700 Fix build with PIC on some systems commit c41b629d4831cde47a8c0cde435041cc3b996d85 [revision 2337] Author: Henrik Gramner Date: Sun Jun 2 18:41:17 2013 +0200 Fix potential misaligment crash in AVX2 denoise_dct commit e32d9c21339cbb021d6c9ad5897bfde09dcdb63a [revision 2336] Author: Anton Mitrofanov Date: Tue May 28 01:48:15 2013 +0400 Fix building with compilers without inline asm support Also fix crash in high bit depth builds compiled with unaligned stack. commit 3b8e924639ac67a4beb0ebe9b9663de03cdce84d [revision 2335] Author: Anton Mitrofanov Date: Wed May 22 22:43:59 2013 +0400 Fix compilation with OpenCL on MacOS X Also fix crash in the case of OpenCL error during encoding. commit 3aa9a67b6d62bcf11ee69397647230700a32044b [revision 2334] Author: Anton Mitrofanov Date: Mon May 6 22:51:11 2013 +0400 OpenCL support improvement/refactoring Autoload the OpenCL library so that it's not required to run an openCL-enabled build of x264. Update X264_BUILD, which should have been changed with the first patch. commit 0b2c3d35c168011e73300da5fdc690e00a8238e0 [revision 2333] Author: Fiona Glaser Date: Thu May 16 13:51:37 2013 -0700 x86: shave a few instructions off AVX deblock commit e7cb328580c3e1bd7604a64f40abf3e03c474771 [revision 2332] Author: Henrik Gramner Date: Tue May 14 18:57:40 2013 +0200 x86: AVX2 dequant_4x4_dc commit edf31ed3577f35e7ed3934dd74be474f9d22384a [revision 2331] Author: Henrik Gramner Date: Tue May 14 18:53:12 2013 +0200 x86: AVX2 high bit-depth dequant commit bc88d1bb331ee061c38bea80f7a54a76797c31d0 [revision 2330] Author: Fiona Glaser Date: Thu May 9 17:20:05 2013 -0700 x86-64: 64-bit variant of AVX2 hpel_filter ~5% faster than 32-bit. commit 89f067b7cacecf413569e84c6c973c23f67b1ad3 [revision 2329] Author: Henrik Gramner Date: Mon May 6 18:41:24 2013 +0200 x86: AVX2 high bit-depth denoise_dct 28->15 cycles Also reorder instructions to use fewer registers, 3 cycles faster on Ivy Bridge with 64-bit Windows. commit 481e4cdb52989e4b514a2f4345870a19c5c0ae92 [revision 2328] Author: Henrik Gramner Date: Sat May 4 18:48:58 2013 +0200 x86: AVX2 high bit-depth quant quant_4x4: 13->6 cycles quant_4x4_dc: 14->8 cycles quant_8x8: 47->24 cycles quant_4x4x4: 48->25 cycles commit 02aa1368da5c222c8833724abccddd8f02630fc6 [revision 2327] Author: Fiona Glaser Date: Wed May 1 14:32:11 2013 -0700 x86: AVX2 add16x16_idct_dc 27 -> 19 cycles commit 0c00c2c7882de130184e02cf1861599aedb425e8 [revision 2326] Author: Fiona Glaser Date: Mon Apr 29 16:16:54 2013 -0700 x86: faster AVX2 quant_4x4x4 10->9 cycles commit af6647e0e7d647c660003f65b78b4f1a0b186ec2 [revision 2325] Author: Fiona Glaser Date: Sat Apr 27 21:03:32 2013 -0700 x86: AVX2 intra_sad_x3_8x8c 30->22 cycles commit f114746df6ce6a1bcacf46c62b696cc309ab4527 [revision 2324] Author: Henrik Gramner Date: Sun Apr 28 11:11:03 2013 +0200 x86: AVX2 high bit-depth intra_sad_x3_8x8 43->24 cycles commit 8e4f045f815a59ca3d6398ff4ddae7af44766dc8 [revision 2323] Author: Fiona Glaser Date: Wed Apr 24 14:22:15 2013 -0700 x86: AVX2 deblock strength 30->18 cycles commit 594dd84cb85e616f4e260f7fdef6ce5a34360ac7 [revision 2322] Author: Henrik Gramner Date: Wed May 1 17:42:48 2013 +0200 x86: Faster high bit-depth intra_sad_x3_4x4 20->16 cycles on Ivy Bridge commit a8384178bd917576469da040923976cb531be38c [revision 2321] Author: Fiona Glaser Date: Tue Apr 30 17:36:46 2013 -0700 x86: faster SSSE3 hpel ~7% faster using the pmulhrsw trick from mc_chroma. commit 1f5a32c2459ed6f42d9c150d008e3471d61af3ee [revision 2320] Author: Fiona Glaser Date: Mon Apr 29 14:22:23 2013 -0700 x86-64: faster SSSE3 trellis ~2% faster trellis. commit 7cbb27f0ce5ea3e756c628ac606f65d7de57f285 [revision 2319] Author: Fiona Glaser Date: Thu May 2 17:10:26 2013 -0700 x86: 32-byte align the stack if possible Avoids the need for manual 32 byte array alignment on compilers that support -mpreferred-stack-boundary. commit 30c91f62906ce08b5d227002b38ebd64f1291fae [revision 2318] Author: Henrik Gramner Date: Sat May 11 23:39:09 2013 +0200 x86inc: Utilize the shadow space on 64-bit Windows Store XMM6 and XMM7 in the shadow space in functions that clobbers them. This way we don't have to adjust the stack pointer as often, reducing the number of instructions as well as code size. commit 33c352673900bd1b362bb2fe0284e999fccd633d [revision 2317] Author: Henrik Gramner Date: Fri May 3 23:06:10 2013 +0200 x86: Don't use explicitly aligned versions of SAD on AVX CPUs On modern CPUs movdqu isn't slower than movdqa when used on aligned data and using the same code in both cases saves cache. This was already done for the high bit-depth AVX2 implementation but the aligned version still exists as dead code so remove that. commit 16d037211f1dd032288e25ab74d93a569fd93d6c [revision 2316] Author: Henrik Gramner Date: Fri May 3 20:18:03 2013 +0200 x86: Add missing initializations for high bit-depth sad_aligned commit 25e219ad2565e52a6962eb1e16cf19f3482e655b [revision 2315] Author: Fiona Glaser Date: Mon May 13 16:52:18 2013 -0700 x86: add Jaguar CPU detection commit c1e37099627b1dc2f15b295aa4c2eedd431a6dba [revision 2314] Author: Henrik Gramner Date: Tue May 7 17:21:03 2013 +0200 x86inc: Remove .rodata kludges The Mach-O bug was fixed in yasm 0.8.0 and we don't support versions that old. a.out was superseded by ELF on sane systems a few decades ago. commit 5444e95a5c9ee866625b1122a19dbae6bf044008 [revision 2313] Author: Henrik Gramner Date: Sat May 4 16:21:32 2013 +0200 checkasm: Use 64-bit cycle counters Prevents overflows that can occur in some cases. commit 0e000e7a763c9bb5c14257bad365144025013fc9 [revision 2312] Author: Henrik Gramner Date: Fri May 10 13:55:32 2013 +0200 checkasm: Fix stack alignment bug commit 3ba0fb847b1a14f9db5f3dabe209eee2d4edc91d [revision 2311] Author: Fiona Glaser Date: Wed May 8 10:48:41 2013 -0700 Fix invalid memcpy in sliced-threads Likely didn't actually break in practice, but memcpy with src==dst is incorrect. commit 7f3606572957b63f1169bc793ed55bccdb549d56 [revision 2310] Author: Fiona Glaser Date: Mon Apr 29 12:14:01 2013 -0700 Fix two bugs in slice-min-mbs and slices-max Slices-max broke slice-max-size when slice-max wasn't used. Slice-min-mbs broke in rare cases near the end of a threadslice. commit 67d6f602018d0fc1cb05cd6240e4fe1c2646169f [revision 2309] Author: Fiona Glaser Date: Thu Apr 4 18:00:23 2013 -0700 x86: SSSE3 LUT-based faster coeff_level_run ~2x faster coeff_level_run. Faster CAVLC encoding: {1%,2%,7%} overall with {superfast,medium,slower}. Uses the same pshufb LUT abuse trick as in the previous ads_mvs patch. commit c17d12f83381913650d84004815c20a1f7092144 [revision 2308] Author: Fiona Glaser Date: Mon Mar 25 14:03:37 2013 -0700 x86-64: BMI2 cabac_residual functions commit 40316f836d42cb5aee8de5ae6b4a5e417d8446f8 [revision 2307] Author: Fiona Glaser Date: Wed Mar 20 15:08:35 2013 -0700 x86: SSSE3 ads_mvs ~55% faster ads in benchasm, ~15-30% in real encoding. ~4% faster "placebo" preset overall. commit 03396f82bd1a709aa83d15de0affd0c4c5bd621d [revision 2306] Author: Henrik Gramner Date: Tue Apr 16 23:27:53 2013 +0200 x86: AVX2 pixel_ssd_nv12_core commit dc05aebbc51b64b6cf3cfa95a1fbb20f6ffe94c6 [revision 2305] Author: Henrik Gramner Date: Tue Apr 16 23:27:50 2013 +0200 x86: AVX2 high bit-depth pixel_ssd commit f49c2eba352a9087301dfc3c3de902ab083bd9e9 [revision 2304] Author: Henrik Gramner Date: Tue Apr 16 23:27:46 2013 +0200 x86: AVX2 high bit-depth pixel_sad_x3/pixel_sad_x4 Also reduce the number of xmm registers used by sse2/ssse3 pixel_sad_x3. commit 0e69048d4f9664f1293c5eed0604522c67adaff5 [revision 2303] Author: Henrik Gramner Date: Tue Apr 16 23:27:43 2013 +0200 x86: AVX2 high bit-depth vsad commit 9f885c112d6566388d472da68ada0301ce330311 [revision 2302] Author: Henrik Gramner Date: Tue Apr 16 23:27:39 2013 +0200 x86: AVX2 high bit-depth pixel_sad Also use loops instead of duplicating code; reduces code size by ~10kB with negligible effect on performance. commit 295f83af2afa93073d7810ab96b1d8d889a53ed2 [revision 2301] Author: Henrik Gramner Date: Tue Apr 16 23:27:35 2013 +0200 x86: AVX2 high_bit_depth pixel_avg2, get_ref, mc_copy_w16, mc_luma Also reduce the number of xmm registers used by mc_copy_* to avoid saving and restoring xmm6 and xmm7 on 64-bit Windows. commit e7a46b6536ab3ea4806f585b771b6cbb261031d1 [revision 2300] Author: Henrik Gramner Date: Tue Apr 16 23:27:32 2013 +0200 x86: AVX2 nal_escape Also rewrite the entire function to be faster and drop the AVX version which is no longer useful. commit 547a6573af56afe8d551201245775c6ba179e781 [revision 2299] Author: Henrik Gramner Date: Tue Apr 16 23:27:29 2013 +0200 x86: AVX memzero_aligned commit 0f776f63daf47eac9b69ef77aaf7c9c16213cba9 [revision 2298] Author: Henrik Gramner Date: Tue Apr 16 23:27:25 2013 +0200 x86: AVX2 predict_16x16_dc commit 97ad171ae33c51f48e6214abdf7c978e4dd5d2d1 [revision 2297] Author: Henrik Gramner Date: Tue Apr 16 23:27:22 2013 +0200 x86: AVX2 predict_8x8c_p/predict_8x16c_p commit 8ecdeb2709b4b7095237330e68e9a76ea8060a2f [revision 2296] Author: Henrik Gramner Date: Tue Apr 16 23:27:18 2013 +0200 x86: AVX2 predict_16x16_p Also fix the AVX implementation to correctly use the SSSE3 inline asm instead of SSE2. commit f3d521da8163bb9a381284ef0b5c949b8a5c9f9c [revision 2295] Author: Henrik Gramner Date: Tue Apr 16 23:27:14 2013 +0200 x86: AVX high bit-depth predict_16x16_v Also restructure some code to reduce code size of various functions, especially in high bit-depth. commit fa40b44f339501917e7a7c003ab826bf3e7b6a10 [revision 2294] Author: Henrik Gramner Date: Tue Apr 16 23:27:08 2013 +0200 x86: AVX2 high bit-depth predict_4x4_h commit 7908dc632330b6028ab7dae42834e2098e628b24 [revision 2293] Author: Henrik Gramner Date: Tue Apr 16 23:27:04 2013 +0200 x86: AVX2 high bit-depth predict_16x16_h commit 51708c3e193438439aaeaf31c377b070ca403e0e [revision 2292] Author: Henrik Gramner Date: Tue Apr 16 23:27:00 2013 +0200 x86: AVX2 high bit-depth predict_8x8c_h/predict_8x16c_h commit 184c50554ae95aa60edd3fa309ca8013e00a8648 [revision 2291] Author: Henrik Gramner Date: Tue Apr 16 23:26:47 2013 +0200 x86util: Support ymm registers in HADD macros commit 0ea5be852e97d8cfdf04e384a8a78210f87c2dc0 [revision 2290] Author: Fiona Glaser Date: Tue Feb 26 16:26:34 2013 -0800 x86: more AVX2 framework, AVX2 functions, plus some existing asm tweaks AVX2 functions: mc_chroma intra_sad_x3_16x16 last64 ads hpel dct4 idct4 sub16x16_dct8 quant_4x4x4 quant_4x4 quant_4x4_dc quant_8x8 SAD_X3/X4 SATD var var2 SSD zigzag interleave weightp weightb intra_sad_8x8_x9 decimate integral hadamard_ac sa8d_satd sa8d lowres_init denoise commit 19e1a2bbf2d1aaa15ea2d2c118b0236ff64b4bd1 [revision 2289] Author: Loren Merritt Date: Mon Feb 25 21:16:45 2013 +0000 x86inc: create xm# and ym#, analagous to m# For when we want to mix simd sizes within one function. commit 3a8dfb2bc62be21215b6f7d47c53c5a912878656 [revision 2288] Author: Fiona Glaser Date: Fri Apr 5 16:08:35 2013 -0700 x86inc: fix AVX emulation of cmp(p|s)(s|d) commit a3f5c7326c0aa707ccbd5a938a0b65581888b549 [revision 2287] Author: Fiona Glaser Date: Tue Feb 5 17:15:00 2013 -0800 x86-64: cabac_block_residual assembly RDO: ~20% faster than C Bitstream: ~50% faster than C 1-2% faster overall, highest on preset superfast/fast/medium. commit f49a1b2ef6d95d8f0f186df0fc3bfe38414e264f [revision 2286] Author: Steve Borho Date: Thu Feb 21 12:48:40 2013 -0600 OpenCL lookahead OpenCL support is compiled in by default, but must be enabled at runtime by an --opencl command line flag. Compiling OpenCL support requires perl. To avoid the perl requirement use: configure --disable-opencl. When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU device. Lowres intra cost prediction, lowres motion search (including subpel) and bidir cost predictions are all done on the GPU. MB-tree and final slice decisions are still done by the CPU. Presets which do not use a threaded lookahead will not use OpenCL at all (superfast, ultrafast). Because of data dependencies, the GPU must use an iterative motion search which performs more total work than the CPU would do, so this is not work efficient or power efficient. But if there are spare GPU cycles to spare, it can often speed up the encode. Output quality when OpenCL lookahead is enabled is often very slightly worse in quality than the CPU quality (because of the same data dependencies). x264 must compile its OpenCL kernels for your device before running them, and in order to avoid doing this every run it caches the compiled kernel binary in a file named x264_lookahead.clbin (--opencl-clbin FNAME to override). The cache file will be ignored if the device, driver, or OpenCL source are changed. x264 will use the first GPU device which supports the required cl_image features required by its kernels. Most modern discrete GPUs and all AMD integrated GPUs will work. Intel integrated GPUs (up to IvyBridge) do not support those necessary features. Use --opencl-device N to specify a number of capable GPUs to skip during device detection. Switchable graphics environments (e.g. AMD Enduro) are currently not supported, as some have bugs in their OpenCL drivers that cause output to be silently incorrect. Developed by MulticoreWare with support from AMD and Telestream. commit 2d0c47a50622ec59ade303cf150c21b8910a2bce [revision 2285] Author: Fiona Glaser Date: Mon Mar 4 15:19:47 2013 -0800 weightp: improve scale/offset search, chroma Rescale the scale factor if the offset clips. This makes weightp more effective in fades to/from white (and an other situation that requires big offsets). Search more than 1 scale factor and more than 1 offset, depending on --subme. Try to find the optimal chroma denominator instead of hardcoding it. Overall improvement: a few percent in fade-heavy clips, such as a sample from Avatar: TLA. commit 732e4f7e8b9ab6d214cbcf059445b4712709faa4 [revision 2284] Author: Fiona Glaser Date: Tue Feb 19 13:48:44 2013 -0800 Add slices-max feature The H.264 spec technically has limits on the number of slices per frame. x264 normally ignores this, since most use-cases that require large numbers of slices prefer it to. However, certain decoders may break with extremely large numbers of slices, as can occur with some slice-max-size/mbs settings. When set, x264 will refuse to create any slices beyond the maximum number, even if slice-max-size/mbs requires otherwise. commit fdfffa3058cb590765dbb34afa5706755dcb5319 [revision 2283] Author: Fiona Glaser Date: Thu Feb 14 17:22:02 2013 -0800 Add slice-min-mbs feature Works in conjunction with slice-max-mbs and/or slice-max-size to avoid overly small slices. Useful with certain decoders that barf on extremely small slices. If slice-min-mbs would be violated as a result of slice-max-size, x264 will exceed slice-max-size and print a warning. commit 8a3a41de9e5f54cb6e7b5c69486e50471a5c022d [revision 2282] Author: Anton Mitrofanov Date: Tue Mar 26 18:56:21 2013 +0400 Disable mbtree asm with cpu-independent option Results vary between versions because of different rounding results. commit bf52bab4e5607d7f3d98b3999a13cb8149aeef1c [revision 2281] Author: Anton Mitrofanov Date: Tue Mar 26 18:30:00 2013 +0400 Show "avs: no" --disable-avs option instead of empty string commit e74287e93b0ee7afb384624f60dc440b736fec6b [revision 2280] Author: Tim Walker Date: Tue Mar 19 23:42:43 2013 +0100 lavf input: don't use deprecated AVStream fields Fixes building against newer libavcodecs from the Libav project. commit aa73459b710f4c08b654d69573c22fd2fc2a99f8 [revision 2279] Author: Anton Mitrofanov Date: Tue Mar 26 19:54:36 2013 +0400 Fix y4m input with C420paldv colorspace commit 42c500af62fbe09e7a55ecd47fc72331fbe4ae02 [revision 2278] Author: Fiona Glaser Date: Sat Mar 2 01:22:29 2013 -0800 x86: correctly check stack alignment for Atom hadamard_ac Regression in r2265 (only affected compilers with broken stack alignment, like ICL on win32). commit bed18d0e4545e7528bf585a1a3c7fbc05ddbafa4 [revision 2277] Author: Loren Merritt Date: Mon Feb 25 21:23:55 2013 +0000 x86inc: fix some corner cases of SWAP SWAP with >=3 named (rather than numbered) args PERMUTE followed by SWAP with 2 named args used to produce the wrong permutation commit 3cdaca1ac2f6022b1affcd24eff397a03b59fce3 [revision 2276] Author: Fiona Glaser Date: Wed Feb 27 13:30:22 2013 -0800 Fix array overreads that caused miscompilation in gcc 4.8 commit 37033444036210ddab75c3ec5b9b5c2a5abb9d52 [revision 2275] Author: Fiona Glaser Date: Thu Feb 28 13:32:37 2013 -0800 Fix undefined behavior in x264_ratecontrol_mb commit cb4547aefb624105b622368aad62c947f89cc4b1 [revision 2274] Author: Stefan Groenroos Date: Fri Mar 1 22:35:34 2013 +0200 ARM: Fix bug in x264_quant_4x4x4_neon Regression in r2273. commit 3a8baa0ec68c50db3194ed778d0e744d6311cda3 [revision 2273] Author: Stefan Groenroos Date: Mon Feb 25 23:43:09 2013 +0200 ARM: update NEON mc_chroma to work with NV12 and re-enable it Up to 10-15% faster overall. commit 215f2beeadb2ade3a318b397f25b8a6ad3a761d1 [revision 2272] Author: Fiona Glaser Date: Thu Feb 14 15:00:48 2013 -0800 CABAC/CAVLC: use the new bit-iterating macro here too commit 993c81e94eebaacddbbfcec665831d07d89490b7 [revision 2271] Author: Fiona Glaser Date: Fri Feb 8 15:34:38 2013 -0800 quant_4x4x4: quant one 8x8 block at a time This reduces overhead and lets us use less branchy code for zigzag, dequant, decimate, and so on. Reorganize and optimize a lot of macroblock_encode using this new function. ~1-2% faster overall. Includes NEON and x86 versions of the new function. Using larger merged functions like this will also make wider SIMD, like AVX2, more effective. commit 5ee1d03a8b86915d98b165d067dce377df3a87ba [revision 2270] Author: Stephen Hutchinson Date: Tue Feb 12 21:55:43 2013 -0500 Add AvxSynth support to the AviSynth input module. Uses dlopen to load AvxSynth on Linux and OS X. Allows the use of --demuxer avs for AvxSynth, though the only source filter it can currently use is FFMS2. Add a local copy of avxsynth_c.h and its dependent headers in extras/ so that users don't need to actually have AvxSynth development headers installed to enable support for it (mirroring the AviSynth behavior). Based on a patch by 0x09 (tab@lavabit.com) commit 7b1301e946218cfe6e072fea03702754ee0cc8a6 [revision 2269] Author: Fiona Glaser Date: Fri Feb 8 00:13:15 2013 -0800 Eliminate some branchiness in ME/analysis Faster, fewer branch mispredictions. commit 7de9a9aa4bc06843dd7d8afe6bc42c02e27b6b73 [revision 2268] Author: Fiona Glaser Date: Wed Feb 6 16:55:39 2013 -0800 Fix some store forwarding stalls There's quite a few others, but most of them don't help to fix or there's no easy way to avoid them. commit 68a6268bae989c55a02b7e86b169bd1a02793a95 [revision 2267] Author: Fiona Glaser Date: Tue Feb 5 01:23:23 2013 -0800 x86: faster AVX satd/sa8d/sa8d_satd/hadamard_ac Use Conroe-style movddup in AVX transforms; both Sandy Bridge and Bulldozer do movddup in the load unit, so it's totally free this way. On Sandy Bridge: ~6% faster sa8d_satd ~5% faster hadamard_ac ~9% faster 32-bit satd ~2% faster sa8d commit 5d60b9c9ad794a666d0cfe0dd9d66d5b9f58e033 [revision 2266] Author: Fiona Glaser Date: Sat Feb 2 12:37:08 2013 -0800 x86: detect Bobcat, improve Atom optimizations, reorganize flags The Bobcat has a 64-bit SIMD unit reminiscent of the Athlon 64; detect this and apply the appropriate flags. It also has an extremely slow palignr instruction; create a flag for this to avoid massive penalties on palignr-heavy functions. Improve Atom function selection and document exactly what the SLOW_ATOM flag covers. Add Atom-optimized SATD/SA8D/hadamard_ac functions: simply combine the ssse3 optimizations with the sse2 algorithm to avoid pmaddubsw, which is slow on Atom along with other SIMD multiplies. Drop TBM detection; it'll probably never be useful for x264. Invert FastShuffle to SlowShuffle; it only ever applied to one CPU (Conroe). Detect CMOV, to fail more gracefully when run on a chip with MMX2 but no CMOV. commit 75d927053ef5546eb011ff5a5ac19152dd4e3c63 [revision 2265] Author: Oskar Arvidsson Date: Sat Jan 19 01:47:09 2013 +0100 x86: combined SA8D/SATD dsp function Speedup is most apparent for 8-bit (~30%), but gives some improvements for 10-bit too (~12%). 64-bit only for now. commit 790c648d939240808659228f57a22633fc59d6d8 [revision 2264] Author: Oskar Arvidsson Date: Tue Jan 29 23:44:32 2013 +0100 x86: port SSE2+ SATD functions to high bit depth Makes SATD 20-50% faster across all partition sizes but 4x4. commit 93bf1248f7409958818b281e3e6ecca75ddb8d86 [revision 2263] Author: Oskar Arvidsson Date: Wed Feb 6 02:07:53 2013 +0100 x86: faster high bit depth ssd About 15% faster on average. commit 6371c3a527a337c7521912990c89d0474288e105 [revision 2262] Author: Fiona Glaser Date: Fri Jan 18 22:55:46 2013 -0800 x86: optimize and clean up predictor checking Branchlessly handle elimination of candidates in MMX roundclip asm. Add a new asm function, similar to roundclip, except without the round part. Optimize and organize the C code, and make both subme>=3 and subme<3 consistent. Add lots of explanatory comments and try to make things a little more understandable. ~5-10% faster with subme>=3, ~15-20% faster with subme<3. commit 004640653ded52f447ffdb71a45b334dc8e6f3d1 [revision 2261] Author: Fiona Glaser Date: Tue Jan 22 12:31:55 2013 -0800 Fix two bugs in predictor checking pmv wasn't checked properly in some cases, as well as zero vector. Output-changing portion of the following patch. commit d2a9d25429b6843874865a37a5b4f6b401d89abc [revision 2260] Author: Fiona Glaser Date: Thu Jan 10 13:15:52 2013 -0800 Improve lookahead-threads auto selection Smarter decision to improve fast-first-pass performance in 2-pass encodes. Dramatically improves CPU utilization on multi-core systems. Tested on a quad-core Ivy Bridge (12 threads, 1080p): Fast first pass: veryfast: ~7% faster faster: ~11% faster fast/medium: ~15% faster slow/slower: ~42% faster veryslow: ~55% faster CRF/1-pass: veryfast: ~9% faster (all others remained the same) commit 5a764328bdeba650d99fc8db47275708cce79521 [revision 2259] Author: Henrik Gramner Date: Sun Jan 27 23:01:59 2013 +0100 x86: Use SSE instead of SSE2 for copying data Reduces code size because movaps/movups is one byte shorter than movdqa/movdqu. Also merge MMX and SSE versions of memcpy_aligned into a single macro. commit c3983b811f42ae5e4bc4f9c1c919f8e548fc76e3 [revision 2258] Author: Henrik Gramner Date: Sun Jan 13 18:27:08 2013 +0100 64-bit cabac optimizations ~4% faster PIC WIN64: ~3% faster and 16 byte shorter cabac_encode_bypass ~8% faster cabac_encode_terminal Benchmarked on Ivy Bridge UNIX64: One instruction less in cabac_encode_bypass commit f6e0d28ae1bccbda43d95200162f7035661fe1e4 [revision 2257] Author: Mike Gorchak Date: Sat Feb 2 23:35:00 2013 -0800 configure: add QNX support commit 5e0fca86444840752eaedbdc5ebfe4ac0b3a0053 [revision 2256] Author: Henrik Gramner Date: Sun Jan 20 19:35:06 2013 +0100 Windows: Enable DEP and ASLR commit 5ec5c78920914a88da415c57904fa01c99deeb7b [revision 2255] Author: Henrik Gramner Date: Thu Jan 17 19:17:24 2013 +0100 x86inc: Set ELF hidden visibility for global constants commit fd2c4a06c3a4eb02fc1375de782bc5d36eb1d744 [revision 2254] Author: Diego Biurrun Date: Thu Jan 17 11:18:31 2013 +0100 x86inc: Add cvisible macro for C functions with public prefix This allows defining externally visible library symbols. Signed-off-by: Diego Biurrun commit faf3dbe616c8339590409c9aa25777fa76c987a6 [revision 2253] Author: Diego Biurrun Date: Thu Jan 17 11:30:37 2013 -0800 x86inc: rename program_name to private_prefix Synced from libav. The new name is more descriptive and will allow defining a separate public prefix for externally visible library symbols. commit 32695340b0e93e3cc7edd1b5e7db064d94cd3701 [revision 2252] Author: Fiona Glaser Date: Mon Jan 14 05:35:30 2013 -0800 x264.h: improve x264_encoder_reconfig documentation commit 6a82e49370e46914ab479d57548508ccf29da6e5 [revision 2251] Author: Henrik Gramner Date: Sat Feb 16 19:36:50 2013 +0100 Cosmetics: stricter definition of parameterless functions commit b671762973a162705ceacf924a29999cdc6d35d2 [revision 2250] Author: Neil Date: Mon Jan 28 10:47:38 2013 +0800 Update "Install and compile x264" in doc/regression_test.txt commit 43ff8f1681c1cca997ca916508723abea85d0fa2 [revision 2249] Author: Anton Mitrofanov Date: Thu Jan 24 12:11:26 2013 +0400 Fix possible non-determinism with mbtree + open-gop + sync-lookahead Code assumed keyframe analysis would only pull one frame off the list; this isn't true with open-gop. commit c2c2a95708685156a643e920b497d48597e0267c [revision 2248] Author: Anton Mitrofanov Date: Mon Feb 25 19:28:19 2013 +0400 x86: don't use the red zone on win64 commit 5743b19a8264415ab3ed443abd2fefd81a038d6a [revision 2247] Author: Fiona Glaser Date: Sun Feb 10 16:12:34 2013 -0800 x86-64: fix trellis asm with interlacing Regression in r2145. Assembly assumed array was [2][64] when it was actually [2][63]. Tiny (~0.1%) compression improvement. commit 9475e6ac48af90e526611b5f11a2690fa077b0ba [revision 2246] Author: Ronald S. Bultje Date: Wed Jan 30 09:48:14 2013 -0800 x86-32: use simple nop codes for <= sse The "CentaurHauls family 6 model 9 stepping 8" family of CPUs (flags: fpu vme de pse tsc msr cx8 sep mtrr pge mov pat mmx fxsr sse up rng rng_en ace ace_en) SIGILLs on long nop codes. commit 732b072ae236b57cabdbc3b31cd7b482d1f9f9ff [revision 2245] Author: Loren Merritt Date: Tue Jan 8 21:30:57 2013 +0000 Bump dates to 2013 commit f2b4f29c636d5e5c223650c5b22bd8089adfcab9 [revision 2244] Author: Henrik Gramner Date: Mon Dec 17 21:54:00 2012 +0100 x86inc: Drop tzcnt workaround It is no longer needed now that we've bumped the version requirement of yasm to 1.2.0. commit ccda1ba4d8d902945c68aa25ec20867055d1b079 [revision 2243] Author: Fiona Glaser Date: Mon Nov 12 10:28:53 2012 -0800 AVX2/FMA3 version of mbtree_propagate First AVX2 function for testing. Bump yasm version to 1.2.0 for AVX2 support. commit 8a9608bbbdf77ceb3ee537271549111468175a2b [revision 2242] Author: Henrik Gramner Date: Tue Dec 11 16:05:34 2012 +0100 x86inc: Use VEX-encoded instructions in AVX functions Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4 functions for all instructions that exists in a VEX-encoded version. This change makes it easier to extend existing code to use AVX2. Also add support for AVX emulation of a few instructions that were missing before. commit 4cf272851a9d24aacdf664f27a87ebdbfb50e6c2 [revision 2241] Author: Loren Merritt Date: Sun Dec 2 15:56:30 2012 +0000 x86inc: activate REP_RET automatically Now RET checks whether it immediately follows a branch, so the programmer dosen't have to keep track of that condition. REP_RET is still needed manually when it's a branch target, but that's much rarer. The implementation involves lots of spurious labels, but that's ok because we strip them. commit b073e870d135ac27cd97d624330abf0f1fb1ed41 [revision 2240] Author: Ronald S. Bultje Date: Thu Dec 6 15:40:13 2012 -0800 x86inc: support stack mem allocation and re-alignment in PROLOGUE Use this in 8-bit loopfilter functions so they can be used if there is no aligned stack (e.g. x86-32 MSVC or ICC 10.x). commit 9d5ec55b34a4d4f2e044fbc67e2e12a59ea27d2a [revision 2239] Author: Henrik Gramner Date: Mon Dec 17 22:15:02 2012 +0100 Update config.guess and config.sub commit 8eddd52b6d5d638709c5c8278c420eac68a8dde1 [revision 2238] Author: Anton Mitrofanov Date: Tue Jan 8 13:29:49 2013 -0800 Fix crash if the first frame is forced to a non-keyframe This is obviously bad user input, but x264 shouldn't crash if it happens. commit 05c1646333f567aa3de5f7669693b15ee667825d [revision 2237] Author: Bernhard Rosenkränzer Date: Sun Dec 30 12:18:00 2012 -0800 Fix build on ARM with binutils >= 2.23.51.0.6 GAS doesn't seem to like spaces in vld1 anymore, so remove those. commit 23829dd2b2c909855481f46cc884b3c25d92c2d7 [revision 2236] Author: Anton Mitrofanov Date: Fri Nov 23 18:26:53 2012 +0400 Fix pthread_join emulation on win32 and BeOS Doesn't actually affect x264, but it's more correct. commit 042fdd3e6a0e271f62a108da2a1a244dee936045 [revision 2235] Author: Fiona Glaser Date: Tue Nov 27 07:50:51 2012 -0800 Fix typo in r2222 Slightly wrong numbers in level table. commit cd71765c0ba574bb573e75396ef3c6a5c4c00469 [revision 2234] Author: Sergio Basto Date: Thu Nov 22 18:02:50 2012 -0800 configure: fix gpac detection with -Wp,-D_FORTIFY_SOURCE=2 commit 12458a23d1374836fecbed381dfe55513b5ba119 [revision 2233] Author: Sean McGovern Date: Thu Nov 22 18:01:16 2012 -0800 Solaris: use sysconf to get processor count Solaris responds correctly to the same value as Cygwin, so let's use that. commit 0db80bee2765676c2e0e4be21afc2ace900a606c [revision 2232] Author: Anton Khirnov Date: Tue Nov 13 21:01:24 2012 +0100 lavf input: allocate AVFrame correctly Allocate AVFrames correctly with avcodec_alloc_frame(). This caused crashes with newer libavcodecs that try to free frame extradata. commit 144b79159ad20954a7faec1023451a630a65aea1 [revision 2231] Author: Anton Mitrofanov Date: Sun Nov 11 03:44:02 2012 +0400 Fix crash when using libx264.dll compiled with ICL for X86_64 commit bfed708c5358a2b4ef65923fb0683cefa9184e6f [revision 2230] Author: Anton Mitrofanov Date: Fri Nov 9 02:31:10 2012 +0400 Fix possible issues with out-of-spec QP values Fixes a possible regression in r2228. commit 1580a74e339c59cd856100076d8cf46f2d7247b0 [revision 2229] Author: Fiona Glaser Date: Wed Sep 26 13:49:02 2012 -0700 Attempt to optimize PPS pic_init_qp in 2-pass mode Small compression improvement; up to ~0.5% in extreme cases. Helps more with small slice sizes (tiny resolutions or slice-max-size). Note that this changes the 2-pass stats file format. commit b304a7cad10a85d487fa09e7c33e81c6945186b2 [revision 2228] Author: Fiona Glaser Date: Wed Sep 26 13:05:00 2012 -0700 Improve slice header QP selection Use the first macroblock of each slice instead of the last of the previous. Lets us pick a reasonable initial QP for the first slice too. Slightly improved compression. commit 0d5f6fbae9f6c4dbba25571a5d8c643b192606b1 [revision 2227] Author: Fiona Glaser Date: Thu Oct 11 13:27:48 2012 -0700 Update level dpb size calculation to match newer H.264 spec Doesn't actually change encoding behavior, but makes it more correct. Warning messages should now be accurate at higher bit depths and non-4:2:0. Technically, since it redefines x264_level_t, this is an API version increment. commit cc61a4b4d0838b6d5f4cdaf88a0b6d06a12b6d3e [revision 2226] Author: Jan Ekström Date: Sun Oct 7 21:12:05 2012 +0300 Add support for the ffmpeg/vapoursynth high bit depth y4m extensions commit 5d85879921481ef104766657deda4ef8ea4351ec [revision 2225] Author: Diego Biurrun Date: Tue Nov 6 14:48:56 2012 +0100 x86inc: Rename 3dnow2 to 3dnowext The name "3dnowext" is more common than "3dnow2". Doesn't affect x264. commit 00cc16001b35a71ce2329e02bff6e316201cf700 [revision 2224] Author: Diego Biurrun Date: Wed Oct 31 12:23:54 2012 -0700 x86inc: only define program_name if the macro is unset. This allows overriding the value from outside the file. This can be useful if x86inc.asm is used outside of x264. commit 3f516c5238d0c536ea03c8e5334d231facf9f31b [revision 2223] Author: David Wolstencroft Date: Mon Oct 29 09:07:39 2012 -0700 Disable ARM NEON MRC CPU test for Apple devices The Apple A6 CPU doesn't support performance counters, so this test caused a crash. commit ac2d7c08452186703424dcc6933524e95b652479 [revision 2222] Author: Fiona Glaser Date: Tue Nov 6 12:03:20 2012 -0800 Fix crash with no-scenecut + mbtree commit 480bbc9067da7cce3400cf3988bf5fdfa4d9fa3f [revision 2221] Author: Anton Mitrofanov Date: Fri Oct 12 23:43:40 2012 +0400 Fix reconfiguring to crf=0 Lossless mode can't currently be enabled mid-stream. commit 21ba91ae6c361e4ce49ee65e61cc582b1af648ba [revision 2220] Author: Derek Buitenhuis Date: Mon Sep 17 11:09:20 2012 -0700 Fix ALIGNED_ARRAY_EMU macros on ICL ICL's preprocessor doesn't handle it correctly. This fix is similar to libav's fix in 0db2d9. commit 96577475981d979d151626aae61ef317dc54df67 [revision 2219] Author: Jason Martens Date: Thu Sep 13 11:20:40 2012 -0700 Fix use of deprecated av_close_input_file call commit 02217bd2c31feda7aaca813f104c155fe09428b8 [revision 2218] Author: Brad Smith Date: Wed Sep 26 14:13:27 2012 -0700 Fix pkg-config for dynamic vs static linking commit e8e8b9a44ffa9b5f585582375515140ea22985d3 [revision 2217] Author: Brad Smith Date: Mon Sep 10 17:52:04 2012 -0700 Set libm in the configure script if the OS has libm Prerequisite for another configure patch after this. Idea copied from libpthread. commit 8980dd8afbfeeb6bcaa17b97aad0b3c24207665e [revision 2216] Author: Fiona Glaser Date: Thu Aug 16 13:40:32 2012 -0700 Enhance mb_info: add mb_info_update This feature lets the callee know which decoded macroblocks have changed. commit 033df0a8c719f991ab0e0bb0788bd4f08e8b91d7 [revision 2215] Author: Fiona Glaser Date: Thu Aug 16 13:01:17 2012 -0700 Fix mb_info_free with sliced threads x264 would free mb_info before it was completely done using it. commit f93b7865a96248621af078363d5b59691cbcd8aa [revision 2214] Author: Fiona Glaser Date: Tue Aug 7 12:43:26 2012 -0700 Enhance nalu_process Add the input frame opaque pointer to the arguments. This makes it easier to use with multiple simultaneous x264 encodes. commit 05089a37bf55a4134d9ffd014fdae729804a4e7a [revision 2213] Author: Fiona Glaser Date: Mon Aug 6 14:55:35 2012 -0700 Improve mb_info constant mb optimization Allow fast skipping even if the pskip MV isn't zero. commit cc5dcedc3b45d8e7390e2e914bb37f3fa92f6acd [revision 2212] Author: Fiona Glaser Date: Mon Jul 30 12:58:34 2012 -0700 Export the average effective CRF of each frame Useful to judge the resulting quality of a frame when VBV is enabled. commit f8fd6412a94f5f4f0eb5f8a6c0fb2062daebfab8 [revision 2211] Author: Brad Smith Date: Mon Aug 20 23:58:19 2012 -0700 Remove special-casing for OpenBSD pthread handling Previously it was policy to use -pthread, but OpenBSD now recommends -lpthread. its been libpthread anyway and policy has changed to stop using -pthread. commit ed56837e3c56bfb880fac2e4e0025d81d6a7186b [revision 2210] Author: Ronald S. Bultje Date: Thu Jul 26 18:01:49 2012 -0700 x86inc: automatically insert vzeroupper for YMM functions Backported from libav. commit cbb90707e443f0da2521bda1b98cab5705451b8f [revision 2209] Author: Kieran Kunhya Date: Tue Jul 24 08:47:45 2012 -0700 Free user supplied data when deleting a frame This eliminates a memory leak when calling x264_encoder_close. commit 3d03b6190c7af7b941fa746c3dff3b17e5115380 [revision 2208] Author: Fiona Glaser Date: Wed Jul 18 08:33:41 2012 -0700 Revert r2204 People don't seem to like this so I'm just going to get rid of it. commit 2ec694181f8ba3eb1c4153e6b955d399d6448c25 [revision 2207] Author: Fiona Glaser Date: Tue Jul 10 14:10:44 2012 -0700 Faster predictor checking with subme<3 Fix a typo that made an early-skip less effective. Avoid a relatively unpredictable branch. Slightly changed output due to the typo-fix. ~50 cycles faster on Core i7. commit d026397b0bf4c87e96b19c9fff7f43be6c4d9def [revision 2206] Author: Fiona Glaser Date: Mon Jun 25 18:01:29 2012 -0700 Try 8x8 transform analysis even when sub8x8 partitions are present Turn off the sub8x8 partitions, try it, and turn them back on if it didn't help. Small compression improvement with p4x4 on (~0.1-0.5%). Also update related comments. commit dea5d7a54b5ba948ed71d74e0264a2191bcd9815 [revision 2205] Author: Fiona Glaser Date: Fri Jun 8 18:19:59 2012 -0700 Support changing resolutions between passes with macroblock-tree Implement a basic separable bilinear filter to rescale the quantizer offsets. Structure inspired by swscale, but floating-point instead of fixed-point. Not as optimized as it could be, but it's quite fast already. Example compression penalties on a 720p video game recording: First pass with 720p and second as 480p: ~-1.5% (vs. same res) First pass with 480p and second as 720p: ~-3% (vs. same res) commit 498af9c559b8da986544e93f898df02fc9e224b3 [revision 2204] Author: Alexander Prikhodko Date: Tue Jun 12 20:21:35 2012 +0300 Print elapsed time in encoding progress indicator commit bcd1a7070dc5224d591731dfdbabcbbaee0bb984 [revision 2203] Author: Anton Mitrofanov Date: Sat Jun 2 21:27:50 2012 +0400 Cap ratecontrol predictor parameters Limits VBV mispredictions after long periods of relatively constant video. commit 5754ea2db5223b458bd48f0130c13000e3dec15c [revision 2202] Author: Loren Merritt Date: Tue Jul 3 14:38:04 2012 -0700 x86inc: import patches from libav Allow manual invocation of WIN64_SPILL_XMM even under INIT_MMX SSE version of mova is movaps rather than movdqa. YMM version of movnta. Add mp size for named arguments. Fix DEFINE_ARGS when used outside of a cglobal. Define a few more cpuflags. 3-argument wrappers for a few more instructions. commit 5e3aaf1a49b173df916a384942c8089dd5bd8a22 [revision 2201] Author: Anton Mitrofanov Date: Fri Jun 22 22:02:24 2012 +0400 Fix crash with --fps 0 Fix some integer overflows and check input parameters better. Also fix incorrect type specifiers for demuxer info printing. commit df700eae5d5ce5732f80df9ce81e6d3fe99ef56a [revision 2200] Author: Fiona Glaser Date: Tue May 8 15:42:56 2012 -0700 Threaded lookahead Split each lookahead frame analysis call into multiple threads. Has a small impact on quality, but does not seem to be consistently any worse. This helps alleviate bottlenecks with many cores and frame threads. In many case, this massively increases performance on many-core systems. For example, over 100% faster 1080p encoding with --preset veryfast on a 12-core i7 system. Realtime 1080p30 at --preset slow should now be feasible on real systems. For sliced-threads, this patch should be faster regardless of settings (~10%). By default, lookahead threads are 1/6 of regular threads. This isn't exacting, but it seems to work well for all presets on real systems. With sliced-threads, it's the same as the number of encoding threads. commit 7cfe43cc7fb5474a87f02da96ebb850cdf83d73b [revision 2199] Author: Anton Mitrofanov Date: Fri May 4 17:18:12 2012 +0400 Add support for RGB formats in bit-depth conversion filter commit 44d2f0885cd95201b67ed54bab88e91f4ba1556e [revision 2198] Author: Anton Mitrofanov Date: Sat May 12 13:57:49 2012 +0400 Fix some bugs in mb_info code commit 8e57a9a0b5bddfecea5e45345c8c50efb0bac10d [revision 2197] Author: Fiona Glaser Date: Thu Mar 29 14:14:07 2012 -0700 Add mb_info API for signalling constant macroblocks Some use-cases of x264 involve encoding video with large constant areas of the frame. Sometimes, the caller knows which areas these are, and can tell x264. This API lets the caller do this and adds internal tracking of modifications to macroblocks to avoid problems. This is really only suitable without B-frames. An example use-case would be using x264 for VNC. commit 4442eaceb4992098e1e4e30aa13e70bb35d2cae6 [revision 2196] Author: Henrik Gramner Date: Sat Apr 7 00:40:09 2012 +0200 Faster chroma weight cost calculation New assembly function with SSE2, SSSE3 and XOP implementations for calculating absolute sum of differences. commit e8952dffa3b09700e5b7c5e56edd196f0b80a248 [revision 2195] Author: Lucien Date: Sat Mar 31 13:42:49 2012 +0100 Add Level 5.2 support commit 66acbbf6ce6b143cd164d251ceb160870e4ee720 [revision 2194] Author: Henrik Gramner Date: Thu Apr 12 19:14:43 2012 +0200 Eradicate all mention of Extended Profile x264 never supported it and never will because nobody uses it. commit b0f44f9e106afadaded17009079c2281cb18eb56 [revision 2193] Author: Anton Mitrofanov Date: Tue Apr 3 21:46:52 2012 +0400 Fix disabling of mbtree when using 2pass encoding and zones commit ffea9f51f7f7e0a550c9326631a9c6f8c5c885be [revision 2192] Author: Alexander Prikhodko Date: Sat Mar 31 12:06:21 2012 +0300 configure: force select -mXX gcc option for i386/x86-64 Makes multilib compilation more convenient. commit f4aefb3853819adf633c56062d1be77db90819b6 [revision 2191] Author: Rafaël Carré Date: Sun Apr 15 21:20:14 2012 -0400 Update config.guess and config.sub Adds support for a bunch of targets, including: aarch64 (armv8) arm-linux-androideabi commit 62d7007d35c5f0829d96b6ecf459f21d27210ef3 [revision 2190] Author: Alexander Prikhodko Date: Sat Mar 31 11:33:41 2012 +0300 configure: correct use of RC variable and add --extra-rcflags commit 70877e39a4abb4c24d1978a28202c9bf0dce8b47 [revision 2189] Author: Steven Walters Date: Wed Mar 28 21:15:04 2012 -0400 ICL/MSVS: Fix shared library generation and usage MSVS requires exported variables to be declared with the DATA keyword, and requires that imported variables be declared with dllimport. This does not fix x264 cli being unable to use a shared library built by ICL however. commit 52f7a149ef6c39eb0d7eec7884362ba31a4b05ba [revision 2188] Author: Kieran Kunhya Date: Tue Mar 27 17:38:56 2012 +0100 Fix intra-refresh + hrd commit fff12b1b7d8ce5cc9cfcfac09f089bae06cac6d5 [revision 2187] Author: Anton Mitrofanov Date: Sun Mar 25 17:34:24 2012 +0400 Fix frame input colorspace check commit 065fec2704f3c8c6f3f3f5b0fad6870a078ba48c [revision 2186] Author: Fiona Glaser Date: Thu Mar 22 13:56:50 2012 -0700 Fix comment in deblock.c The code does, in fact, handle CAVLC+8x8dct correctly already. commit bca412764eb198433ca45abd097368e5154c7fbb [revision 2185] Author: Fiona Glaser Date: Tue Mar 13 14:37:26 2012 -0700 Fix sliced-threads ratecontrol bug Was using qp instead of qscale; could cause NANs (not to mention less accurate results). commit e046ba72a4230fdd6c7907ebf7ae235edb98faf2 [revision 2184] Author: Anton Mitrofanov Date: Sun Mar 11 23:08:18 2012 -0700 Fix clobbering of mutex/cvs Regression in r2183. Bizarrely seemed to work on many platforms, but crashed on win64 and may have been slower. Only affected sliced threads during encoding, but could cause crashes on x264 encoder close even without sliced threads. commit a155572ed547a3627ef00ca70ab804ff452147cd [revision 2183] Author: Fiona Glaser Date: Fri Feb 24 13:34:39 2012 -0800 Sliced-threads: do hpel and deblock after returning Lowers encoding latency around 14% in sliced threads mode with preset superfast. Additionally, even if there is no waiting time between frames, this improves parallelism, because hpel+deblock are done during the (singlethreaded) lookahead. For ease of debugging, dump-yuv forces all of the threads to wait and finish instead of setting b_full_recon. commit 90408ecab16a06ceaa181ff2e495b8f1a9d170fa [revision 2182] Author: Fiona Glaser Date: Fri Feb 24 13:16:52 2012 -0800 Add full-recon API option Fully reconstruct frames even without dump-yuv. commit 5b2c62aec269be7d0b1ff62df09660369f4e20e0 [revision 2181] Author: Fiona Glaser Date: Wed Feb 22 13:33:36 2012 -0800 x86inc: switch to amdnops Recent AMD CPUs' instruction decoders choke horribly on extremely long nops (i.e. with 4 prefixes). Won't affect much, since we don't use ALIGN much. commit 42db5e6f8f704a2b0a9edf5d9cd4a17d80e5b816 [revision 2180] Author: Fiona Glaser Date: Tue Feb 14 16:54:03 2012 -0800 BMI1 decimate functions Intel was nice enough to make tzcnt equal to "rep bsf", which is backwards-compatible. This means we don't actually have to add new functions to make it work. commit 92b0bd9665860d7b48f313d6fd72a583ecb01ddf [revision 2179] Author: Fiona Glaser Date: Tue Feb 14 15:07:10 2012 -0800 Minor asm changes commit 2535ba17b2598f4155955857c12d52a377a75517 [revision 2178] Author: Fiona Glaser Date: Thu Feb 9 14:23:52 2012 -0800 Add row-reencoding support to VBV for improved accuracy Extremely accurate, possibly 100% so (I can't get it to fail even with difficult VBVs). Does not yet support rows split on slice boundaries (occurs often with slice-max-size/mbs). Still inaccurate with sliced threads, but better than before. commit bc473ddfd2f5925715d2895da666e214ebf04c84 [revision 2177] Author: Fiona Glaser Date: Thu Feb 9 12:38:44 2012 -0800 Abstract bitstream backup/restore functions Required for row re-encoding. commit 48e8e52e740fdc7ddca792d4afe240a213f66df5 [revision 2176] Author: Anton Mitrofanov Date: Thu Feb 9 15:27:53 2012 -0800 Add an small per-MB cost penalty for lowres Helps avoid VBV predictors going nuts with very low-cost MBs. One particular case this fixes is zero-cost MBs: adaptive quantization decreases the QP a lot, but (before this patch), no cost penalty gets factored in for this, because anything times zero is zero. commit 1b31a10c7c3210d5eb14d522aaa0cfbe0e7a25e8 [revision 2175] Author: Fiona Glaser Date: Mon Feb 13 18:31:51 2012 -0800 Remove explicit run calculation from coeff_level_run Not necessary with the CAVLC lookup table for zero run codes. commit 9da19fbee621ca5b052891b3c010f8bc89b2ba93 [revision 2174] Author: Fiona Glaser Date: Mon Feb 13 13:20:06 2012 -0800 Export PSNR/SSIM in x264 API commit 3a5f2fe30aeb5314b74f83b1960e9a40776347e9 [revision 2173] Author: Ronald S. Bultje Date: Wed Feb 8 13:10:31 2012 -0800 x86inc: support yasm -f win64 Not necessary for x264, as -m amd64 already does the right thing, but used by external users of x86inc. commit 3131a19cabcdca221ce4cd61a3cff68d99f1a517 [revision 2172] Author: Henrik Gramner Date: Wed Feb 1 23:52:48 2012 +0100 Fix incorrect zero-extension assumptions in x86_64 asm Some x264 asm assumed that the high 32 bits of registers containing "int" values would be zero. This is almost always the case, and it seems to work with gcc, but it is *not* guaranteed by the ABI. As a result, it breaks with some other compilers, like Clang, that take advantage of this in optimizations. Accordingly, fix all x86 code by using intptr_t instead of int or using movsxd where neccessary. Also add checkasm hack to detect when assembly functions incorrectly assumes that 32-bit integers are zero-extended to 64-bit. commit d52d0b1e6a9323911818c2a89764f6827974e0f7 [revision 2171] Author: Fiona Glaser Date: Thu Feb 23 09:11:23 2012 -0800 Fix possible alignment crash when linking from MSVC x264_cavlc_init needs to be stack-aligned now. commit 0a369502ab83c32ccebdb1888e6981ef872baaf0 [revision 2170] Author: Anton Mitrofanov Date: Tue Feb 21 12:58:22 2012 -0800 Fix rare overflow in 10-bit intra_satd_x3_16x16 asm commit 38a26cdfc54ffd60c90651f3b96490d772e6dd73 [revision 2169] Author: Steven Walters Date: Sat Feb 11 22:56:43 2012 -0500 ICL: fix out of tree building and resource file usage on Windows commit 10e1ba55803970ecd240f2057e7dfe0c22fc8efb [revision 2168] Author: Oka Motofumi Date: Mon Feb 6 06:07:34 2012 +0900 Add error handling for out-of-tree build commit 0fc5acc6e6c038f6380f614e4dc4e1893b716b7e [revision 2167] Author: Anton Mitrofanov Date: Tue Mar 6 17:34:02 2012 +0400 Fix RGB colorspace input BGR/BGRA input was correct. commit 282c3cfb22f4ab526d96678249ccdc7f16531811 [revision 2166] Author: Fiona Glaser Date: Mon Feb 13 16:40:32 2012 -0800 Fix interlaced + extremal slice-max-size Broke if the first macroblock in the slice exceeded the set slice-max-size. commit a37a42450cdc31393dae56aed5a726a42fd540d6 [revision 2165] Author: Henrik Gramner Date: Sun Feb 5 20:43:09 2012 +0100 Fix regression in r2141 Broke register preservation in x264_cpu_cpuid and x264_cpu_xgetbv. Did not cause any problems. commit ae289e6f03b76afa8736806e683349e8e59fcc93 [revision 2164] Author: Fiona Glaser Date: Thu Jan 19 14:56:54 2012 -0800 TBM, AVX2, FMA3, BMI1, and BMI2 CPU detection support TBM and BMI1 are supported by Trinity/Piledriver. The others (and BMI1) will probably appear in Intel's upcoming Haswell. Also update x86inc with AVX2 stuff. commit e0581e0878c1995b215c51691af6bdf7a386946f [revision 2163] Author: Loren Merritt Date: Fri Feb 3 06:27:18 2012 +0000 x86inc: add TAIL_CALL macro to abstract a common asm idiom commit 04c38190c60658d544801718fc38fa3f745381d9 [revision 2162] Author: Fiona Glaser Date: Wed Jan 25 16:44:38 2012 -0800 Minor asm optimizations/cleanup commit 6d7c5efcf6f8751f768177bf828973a7bd4fdcf6 [revision 2161] Author: Fiona Glaser Date: Tue Jan 24 19:03:58 2012 -0800 Clean up and optimize weightp, plus enable SSSE3 weight on SB/BDZ Also remove unused AVX cruft. commit 047175e610d3d5360f69e4f8168ff6fbafda2465 [revision 2160] Author: Fiona Glaser Date: Mon Jan 23 18:57:58 2012 -0800 XOP frame_init_lowres Covers both 8-bit and 16-bit, ~5-10% faster on Bulldozer. commit abc88d60b5e0d803d6d4f0a5d9ece7dd0bdde0f1 [revision 2159] Author: Fiona Glaser Date: Tue Jan 17 15:25:10 2012 -0800 XOP 8x8 zigzags Field: 35(mmx) ->16(xop) cycles Frame: 32(ssse3)->20(xop) cycles commit aa47955a0ec65218e8bb967d36689069baca5fd1 [revision 2158] Author: Fiona Glaser Date: Mon Jan 23 15:09:38 2012 -0800 AVX 32-bit hpel_filter_h Faster on Sandy Bridge. Also add details on unsuccessful optimizations in these functions. commit d7407cf81816fff7ab32ceb2398575724e8cc737 [revision 2157] Author: Fiona Glaser Date: Fri Jan 27 16:29:30 2012 -0800 x86inc: add high halfword register support Might be useful in a few cases. commit acabceb6530d1858bcd009b055e217c75344c442 [revision 2156] Author: Ronald S. Bultje Date: Wed Jan 25 13:53:59 2012 +0800 Change %ifdef directives to %if directives in *.asm files This allows combining multiple conditionals in a single statement. commit 82d8cdde567b1c1e8d2046bbb831d0daafe8213b [revision 2155] Author: Anton Mitrofanov Date: Sun Jan 22 22:13:52 2012 +0400 Use TV range algorithm for bit-depth conversions Such sources are more common, so better to be correct for the common case. This also produces less error for the case of full range than the previous algorithm produced for the case of TV range. commit 27a7b05b8330d0756e5e3f6669282561030f54fa [revision 2154] Author: Hii Date: Wed Jan 25 16:29:22 2012 +0800 Bump dates to 2012 commit 762f677e095a40e1927086bb08799c01e05c2ee4 [revision 2153] Author: Henrik Gramner Date: Sat Jan 28 21:38:27 2012 +0100 Add Windows resource file Displays version info in Windows Explorer. commit 545b41caa5903ebcb5d9336a59f9bf5a50a45037 [revision 2152] Author: Sergey Radionov Date: Mon Jan 16 13:22:44 2012 -0800 Fix win32 pthread_cond_signal Isn't used by x264 currently, so didn't cause a problem. Fix backported from libav. commit 697a11e8ecb1376cddd4a8d4f4fa693e41c1987e [revision 2151] Author: Mans Rullgard Date: Wed Feb 1 15:55:25 2012 -0800 ARM: align asm functions to 4 bytes. Some linkers apparently fail to correctly align ARM functions when mixing with Thumb code. commit f59b310fd87b643b59d6e109e49fdf9d0a04ce91 [revision 2150] Author: Anton Mitrofanov Date: Sun Jan 22 13:00:23 2012 +0400 Fix normalization of colorspace when input is packed YUV 4:2:2 commit 9fb055856a617f5ddca15a0c5745ff1c1486ad9a [revision 2149] Author: Fiona Glaser Date: Sat Jan 21 12:54:40 2012 -0800 Force keyint-min 1 with Blu-ray Fixes an issue with referencing across I-frames that's prohibited in Blu-ray for some godforsaken reason. commit 77cfcb6acf648da00eb4ddb52bcb7006bc64a61a [revision 2148] Author: Oka Motofumi Date: Sun Jan 29 20:34:41 2012 +0900 Fix crash in --demuxer y4m with unsupported colorspace commit 30829c0c7e6bbf40d1b3ed5fcb5a45d85407978f [revision 2147] Author: Anton Mitrofanov Date: Mon Jan 16 14:02:53 2012 -0800 Fix overread/possible crash with intra refresh + VBV commit 26c8303472b837e301d789ba569eae01955cf7f6 [revision 2146] Author: Loren Merritt Date: Wed Jan 18 15:47:07 2012 -0800 Fix trellis 2 + subme >= 8 Trellis didn't return a boolean value as it was supposed to. Regression in r2143-5. commit 7d804baf3bca6ad33e18ccd0a838274214a8a7a0 [revision 2145] Author: Loren Merritt Date: Fri Jan 6 15:53:29 2012 +0000 CABAC trellis opts part 4: x86_64 asm Another 20% faster. 18k->12k codesize. This patch series may have a large impact on encoding speed. For example, 24% faster at --preset slower --crf 23 with 720p parkjoy. Overall speed increase is proportional to the cost of trellis (which is proportional to bitrate, and much more with --trellis 2). commit dd354db4db2f26e63ed36eb790052c6794e5a684 [revision 2144] Author: Loren Merritt Date: Fri Jan 6 15:53:04 2012 +0000 CABAC trellis opts part 3: make some arrays non-static commit 4abcf60a04e358b87da284f3a5fac3e2949b6de1 [revision 2143] Author: Loren Merritt Date: Thu Dec 22 17:56:06 2011 +0000 CABAC trellis opts part 2: C optimizations Hoist the branch on coef value out of the loop over node contexts. Special cases for each possible coef value (0,1,n). Special case for dc-only blocks. Template the main loop for two common subsets of nodes, to avoid a bunch of branches about which nodes are live. Use the nonupdating version of cabac_size_decision in more cases, and omit those bins from the node struct. CABAC offsets are now compile-time constants. Change TRELLIS_SCORE_MAX from a specific constant to anything negative, which is cheaper to test. Remove dct_weight2_zigzag[], since trellis has to lookup zigzag[] anyway. 60% faster on x86_64. 25k->18k codesize. commit 253cd7baefb7f5d101725034b2c37afacc012305 [revision 2142] Author: Loren Merritt Date: Thu Dec 22 17:55:06 2011 +0000 CABAC trellis opts part 1: minor change in output Due to different tie-break order. commit 0d7a9100d12c618acea3f01b8bb9cc306f475b47 [revision 2141] Author: Henrik Gramner Date: Sun Jan 8 04:14:10 2012 +0100 x86inc improvements for 64-bit Add support for all x86-64 registers Prefer caller-saved register over callee-saved on WIN64 Support up to 15 function arguments commit 8a6a062e11d4074c081d076408cb0bd6def1af8e [revision 2140] Author: Ilia Valiakhmetov Date: Sun Jan 15 04:47:58 2012 -0600 High bit depth SSE2/AVX add8x8_idct8 and add16x16_idct8 From Google Code-In. commit a35fd4194dd7004abe6f66679496beded405515a [revision 2139] Author: Edward Wang Date: Wed Jan 4 15:35:54 2012 -0800 MMX/SSE2/AVX predict_8x16_p, high bit depth fdct8 From Google Code-In. commit 9301bbd39fb0a49b1e986f9a7c29685439686de4 [revision 2138] Author: Fiona Glaser Date: Thu Dec 22 14:03:15 2011 -0800 XOP 8-bit fDCT Use integer MAC for one of the SUMSUB passes. About a dozen cycles faster for 16x16. commit c83edc0427e78c58683af99b80e0234c77b3e41a [revision 2137] Author: Cristian Militaru Date: Wed Jan 4 12:38:08 2012 -0800 High bit depth intra_sad_x3_4x4 From Google Code-In. commit 9c0fa2d63f549a44f869562cffa9c041a32ae41d [revision 2136] Author: Fiona Glaser Date: Thu Dec 8 13:45:41 2011 -0800 Use a large LUT for CAVLC zero-run bit codes Helps the most with trellis and RD, but also helps with bitstream writing. Seems at worst neutral even in the extreme case of a CPU with small L2 cache (e.g. ARM Cortex A8). commit de7aed78cd2f70017f3c479d8f8dc32d52cee607 [revision 2135] Author: Matt Habel Date: Fri Dec 16 23:16:09 2011 -0800 High bit depth intra_sad_x3_8x8, intra_satd_x3_4x4/8x8c/16x16 Also add an ACCUM macro to handle accumulator-induced add-or-swap more concisely. commit d9dee734a9af1788461def43321f19be6a3d2d72 [revision 2134] Author: Shitiz Garg Date: Sat Dec 3 15:34:57 2011 -0800 MMX 10-bit predict_8x8c_h and predict_8x16c_h From Google Code-In. commit 7496fc4aeaaaf5b470b1eb0f73ce8ea71d0116f2 [revision 2133] Author: Aaron Schmitz Date: Wed Nov 30 00:15:45 2011 -0600 Some MBAFF x86 assembly functions. deblock_chroma_420_mbaff, plus 422/422_intra_mbaff implemented using existing functions. From Google Code-In. commit b8d7b8acb48b45afbfd7efb5baac79475682684a [revision 2132] Author: George Stephanos Date: Thu Dec 1 16:53:45 2011 -0800 More ARM NEON assembly functions predict_8x8_v, predict_4x4_dc_top, predict_8x8_ddl, predict_8x8_ddr, predict_8x8_vl, predict_8x8_vr, predict_8x8_hd, predict_8x8_hu. From Google Code-In. commit e269ca55e5244280afd0347c1088083cf7043d48 [revision 2131] Author: Ilia Date: Mon Nov 28 05:20:09 2011 -0800 More 4:2:2 asm functions High bit depth version of deblock_h_chroma_422. Regular and high bit depth versions of deblock_h_chroma_intra_422. High bit depth pixel_vsad. SSE2 high bit depth and MMX 8-bit predict_8x8_vl. Our first GCI patch this year! commit 5d66c5011488539f99ceafdb47b0856a8e9dae0b [revision 2130] Author: Henrik Gramner Date: Thu Dec 8 16:14:35 2011 +0100 SSE2 and SSSE3 versions of sub8x16_dct_dc Also slightly faster sub8x8_dct_dc commit 3ea6a8b22e0aa89e3749e9c95edfeaad9d341b7e [revision 2129] Author: Steven Walters Date: Mon Dec 5 08:46:34 2011 -0500 Resize filter updates Use AVPixFmtDescriptors to pick the most compatible x264 csp for any pixel format. Fix deprecated use of av_set_int. Now requires libavutil >= 51.19.0 commit f71d047d0bc129eb9f4724e023bf888a9124338b [revision 2128] Author: Oka Motofumi Date: Thu Jan 5 14:23:50 2012 -0800 Add out-of-tree build support commit 5539220e5afc641a6747c6d95f41e5efbe5858e1 [revision 2127] Author: Anton Mitrofanov Date: Fri Dec 16 18:17:00 2011 +0400 Limit SSIM to 100db Avoids floating point error for infinite SSIM (lossless). commit 13c236172f0ff40ca149a2e862498457cd32ccb9 [revision 2126] Author: Reynaldo H. Verdejo Pinochet Date: Wed Jan 4 13:16:12 2012 -0300 Fix wrong conditional inclusion of inttypes.h inttypes.h is required by encoder/ratecontrol.c for SCNxxx macros, and HAVE_STDINT_H does not imply having inttypes.h. stdint.h is a subset of inttypes.h, but this isn't enough for x264. This change fixes building x264 with Android's toolchain. commit 2df9d45db64110854e6da6a2037d6c432c5463fe [revision 2125] Author: Anton Mitrofanov Date: Wed Dec 21 11:08:56 2011 +0400 Fix crash with sliced threads and input height <= 112 commit e3d311813f3931133962f7ab8ee2305d231df83d [revision 2124] Author: Phillip Blucas Date: Mon Dec 19 17:43:41 2011 -0600 Fix loading custom 8x8 chroma quant matrices in 4:4:4 commit 9fd7ccb2b635276d019e137844c693b525f92244 [revision 2123] Author: Anton Mitrofanov Date: Fri Dec 16 01:48:07 2011 +0400 Fix PCM cost overflow commit 1d70d0e56003b762439ad4b5d8e72729b51516ae [revision 2122] Author: Anton Mitrofanov Date: Fri Dec 9 01:54:22 2011 +0400 Fix overflow in 8-bit x86 vsad asm function commit b6ce6c64c17071804676435da9b1c07b902857e3 [revision 2121] Author: Anton Mitrofanov Date: Wed Dec 7 19:14:52 2011 +0400 Fix crash in --fullhelp when compiled against recent ffmpeg Don't assume all pixel formats have a description. commit c3ba63bbe83bd20d06a64cfecd6b878e8f49bc13 [revision 2120] Author: Fiona Glaser Date: Tue Dec 6 14:39:21 2011 -0800 Fix regression in r2118 Broke trellis with i16x16 macroblocks. commit 9dc2391576b35acb55c04773049a0b817f306969 [revision 2119] Author: Fiona Glaser Date: Wed Nov 30 13:02:12 2011 -0800 Modify MBAFF chroma deblock functions to handle U/V at the same time Allows for more convenient asm implementations. commit d0bf649fcc1a79da12e220c4364aeca6045dfbed [revision 2118] Author: Fiona Glaser Date: Thu Nov 10 16:16:13 2011 -0800 CABAC trellis optimizations: use SIMD quant Significant speed increase, minor change in output due to rounding. commit 6767f967831048669e45e65681f37011483b4fa0 [revision 2117] Author: Steven Walters Date: Sun Nov 6 09:48:30 2011 -0800 YUV range detection and support for x264CLI Two new options: --input-range and --range. --input-range forces the range of the input in case of misdetection; auto by default. -- range sets the range of the output; x264cli will convert if necessary, TV by default. --fullrange is now removed as a CLI option (but the libx264 API is unchanged). commit f9a4c4d9828c1cc60135d0301981ea71fd90f6ca [revision 2116] Author: Kieran Kunhya Date: Fri Nov 4 20:09:13 2011 +0000 Pass through user data commit 1c774e936a315fdfb92a35c402b351a1c542a13a [revision 2115] Author: Fiona Glaser Date: Thu Oct 27 14:05:56 2011 -0700 Remove unpredictable branch in CABAC dqp commit f3a7517cb9b06a580623cbea0f140be534b99877 [revision 2114] Author: Loren Merritt Date: Sun Oct 23 23:15:11 2011 +0000 x86inc: AVX symmetry optimization 3-arg AVX ops with a memory arg can only have it in src2, whereas SSE emulation of 3-arg prefers to have it in src1 (i.e. the move). So, if the op is symmetric and the wrong one is memory, swap them. Eliminates redundant moves in some cases when using 3-operand without AVX with memory arguments. Also fix movss and movsd in some cases, and flag shufps correctly as float. commit 5ebbcd8748ae8d8b184db5a8f9b46a9ad865f0ae [revision 2113] Author: Anton Mitrofanov Date: Tue Nov 29 13:45:13 2011 -0800 checkasm: shut up gcc warnings, fix some naming of functions in results commit 561f71ebf741370075b970fb9d31a593cf47782f [revision 2112] Author: Mans Rullgard Date: Mon Nov 28 16:29:12 2011 -0800 checkasm: fix build on ARM Because of how ALIGNED_ARRAY_16 is defined on ARM, array initialisers cannot be used here. Use memset() instead. commit 24bf90abde21e77c574f2bd43e38a3222c3183ef [revision 2111] Author: Anton Mitrofanov Date: Sat Nov 12 01:31:49 2011 +0400 Improve makefile rules Remove the need for "make clean" after most reconfigures. commit 87b23e25eee0c04bb47957445e7cf941a7d8b980 [revision 2110] Author: Anton Mitrofanov Date: Sat Nov 12 00:47:48 2011 +0400 Mark some local functions as static, cosmetics commit 2ecbcd73d60d2f749696b39627c91e28a396538b [revision 2109] Author: Anton Mitrofanov Date: Fri Nov 11 23:19:02 2011 +0400 Fix crash if timecode file opening fails commit f1387840b98560ae34aea9ca09d55984812ad50b [revision 2108] Author: Fabian Greffrath Date: Fri Nov 11 13:25:43 2011 -0800 Configure: force PIC for shared build on PARISC and MIPS commit e5063ab30bcb79f94774b6d9ce91b098ade01d6d [revision 2107] Author: Anton Mitrofanov Date: Sat Oct 22 19:41:07 2011 +0400 Improve yasm version check Previous check allowed certain earlier versions that weren't fully compatible. commit 12104b22820b38b4976e83a6ee00dcb59ed959f1 [revision 2106] Author: Fiona Glaser Date: Tue Oct 18 14:30:26 2011 -0700 Add fenc prefetching to adaptive quant Many fewer cache misses, faster adaptive quant. commit 9bbfc30284469a70374a75fecfa322c4740dc2b7 [revision 2105] Author: Fiona Glaser Date: Tue Oct 18 14:14:03 2011 -0700 Split prefetch_fenc between colorspaces Add 4:2:2 version. commit b63a73da3add660358a4bad1a590c2d4ed466dc4 [revision 2104] Author: Fiona Glaser Date: Tue Oct 11 17:04:32 2011 -0700 Some more 4:2:2 x86 asm coeff_last8, coeff_level_run8, var2_8x16, predict_8x16c_dc, satd_4x16, intra_mbcmp_8x16c_x3, deblock_h_chroma_422 commit 50aaf8d84ac6fc78794b98cfe6a25440a09fbb82 [revision 2103] Author: Loren Merritt Date: Tue Oct 11 18:12:43 2011 +0000 Remove obsolete versions of intra_mbcmp_x3 intra_mbcmp_x3 is unnecessary if x9 exists (SSSE3 and onwards). commit 1111780d8e392455870898bacae30a413ae98464 [revision 2102] Author: Loren Merritt Date: Mon Oct 10 05:42:36 2011 +0000 SSSE3/SSE4/AVX 9-way fully merged i8x8 analysis (sa8d_x9) x86_64 only for now, due to register requirements (like sa8d_x3). i8x8 analysis cycles (per partition): penryn sandybridge bulldozer 616->600 482->374 418->356 preset=faster 892->632 725->387 598->373 preset=medium 948->650 789->409 673->383 preset=slower commit 422979198e492d5068034a3a5b1e4991af2b63a1 [revision 2101] Author: Fiona Glaser Date: Fri Sep 30 19:09:19 2011 -0700 SSSE3/SSE4/AVX 9-way fully merged i8x8 analysis (sad_x9) ~3 times faster than current analysis, plus (like intra_sad_x9_4x4) analyzes all modes without shortcuts. commit da66eef02e8d9cb57c52aeecb7371b9968747c2b [revision 2100] Author: Loren Merritt Date: Wed Oct 5 13:29:21 2011 -0700 Merge i4x4 prediction with intra_mbcmp_x9_4x4 Avoids a redundant prediction after analysis. commit 9f027f4f3f9b03b5dabe081a12ca1b80c20ffc18 [revision 2099] Author: Fiona Glaser Date: Wed Oct 5 13:17:31 2011 -0700 Inline i4x4/i8x8 encode into intra analysis Larger code size, but faster. commit a5a6d0eeadbba6ae3232f620345762aebca240ab [revision 2098] Author: Fiona Glaser Date: Wed Sep 21 17:12:10 2011 -0700 Initial XOP and FMA4 support on AMD Bulldozer ~10% faster Hadamard functions (SATD/SA8D/hadamard_ac) plus other improvements. commit e73b85b56437827f881d1406e11d2cca4bbe5583 [revision 2097] Author: Mans Rullgard Date: Tue Sep 27 21:14:14 2011 +0400 ARM: update NEON chroma deblock functions to NV12 pixel format commit 9c356e2558948714bdbb991a9f9cb9a3e1f0121b [revision 2096] Author: Sean McGovern Date: Mon Oct 17 12:45:15 2011 -0700 Add /usr/lib/{64/}values-xpg6.o to $LDFLAGS on Solaris This is required for POSIX.1-2001 compliance. commit 6c50ab569d95ebb07e5fb437a38d646bf607c74b [revision 2095] Author: Sean McGovern Date: Mon Oct 17 12:44:03 2011 -0700 Fix linker test for -Bsymbolic The Solaris linker only accepts -Bsymbolic for objects compiled in dynamic mode (i.e. shared objects), so pass -shared to gcc. Additionally, for x86_32 unresolved textrels cause a linker error so mark the .text section as 'impure'. commit 421c38f22c7bdaf2981b2ffb72332c40cadd7332 [revision 2094] Author: Sean McGovern Date: Mon Oct 17 12:43:28 2011 -0700 Add $SOFLAGS to exported SOFLAGS make variable commit dd713cae59c062440b046fe75d60af83d049de3c [revision 2093] Author: Henrik Gramner Date: Sat Sep 24 15:56:08 2011 +0200 Allow setting a chroma format at compile time Gives a slight speed increase and significant binary size reduction when only one chroma format is needed. commit 68f6db44035e8f9d4d00a73e5703eb1d7ff8d619 [revision 2092] Author: Harfe Leier Date: Fri Sep 30 12:49:33 2011 -0700 Improve profile help List high422/high444 profiles, and don't show non-high-bit-depth profiles in high bit depth builds. commit 675110a687459cc03685489470bbc730580a793b [revision 2091] Author: Yusuke Nakamura Date: Thu Oct 20 03:09:51 2011 +0900 Fix infinite loop parsing TDecimate Mode 3 timecode v1 files commit 2ec99b3b94f986b456de1525087ee85b6fa79091 [revision 2090] Author: Fiona Glaser Date: Mon Oct 10 17:44:31 2011 -0700 Fix some integer overflows/signedness errors found by IOC The only real bug here is in slicetype.c, which may or may not affect real encodes. commit ae1288c43780ed9be60b59dd556d5f85df7498e2 [revision 2089] Author: Fiona Glaser Date: Wed Oct 12 09:16:32 2011 -0700 Fix pixel_var2 with 4:2:2 encoding Might have caused artifacts or suboptimal chroma compression. commit 9ac39f6078659f4f5cf548460dec940a04fd52c8 [revision 2088] Author: Anton Mitrofanov Date: Sun Oct 9 19:14:16 2011 +0400 Fix chroma intra analysis in 4:4:4 lossless mode commit 294df95060118de1d605ce20fcf490cdb4f4d14c [revision 2087] Author: Anton Mitrofanov Date: Sun Oct 9 01:13:29 2011 +0400 Fix use of uninitialized MVs in sub8x8 RDO commit 3ff2feee5a176ec8012c313e4a9e2b3611f29614 [revision 2086] Author: Fabian Greffrath Date: Fri Oct 7 19:04:17 2011 -0700 Fix detection of Alpha CPU arch on alphaev67 commit 2701440c515a9a8aee1c87d7c06c98e43c3d813f [revision 2085] Author: Fiona Glaser Date: Wed Sep 14 14:53:04 2011 -0700 Optimize x86 asm for Intel macro-op fusion That is, place all loop counter tests right before their conditional jumps. commit 2d481bc0ee053634fe46c0df2cbc646733dd137d [revision 2084] Author: Fiona Glaser Date: Mon Sep 12 11:51:23 2011 -0700 CAVLC: clean up and restructure Somewhat faster CAVLC and RD bit-counting. commit da768d95d5d63f1eac77a35731079ce02aaa125c [revision 2083] Author: Fiona Glaser Date: Thu Sep 8 17:27:02 2011 -0700 CABAC: clean up and restructure Somewhat faster CABAC and RD bit-counting. commit 389b401a99f2f33b41db7d74904b3ff7509d79e5 [revision 2082] Author: Fiona Glaser Date: Sun Sep 4 11:31:29 2011 +0200 Some initial 4:2:2 x86 asm commit 5b0cb86f27ba0c5433c404bed51c06a5124dfb49 [revision 2081] Author: Henrik Gramner Date: Fri Aug 26 15:57:04 2011 +0200 4:2:2 encoding support commit 3d82e875d06b9d1e15ca2baa16b1bd9640500972 [revision 2080] Author: Loren Merritt Date: Mon Aug 15 18:18:55 2011 +0000 SSSE3/SSE4 9-way fully merged i4x4 analysis (sad/satd_x9) i4x4 analysis cycles (per partition): penryn sandybridge 184-> 75 157-> 54 preset=superfast (sad) 281->165 225->124 preset=faster (satd with early termination) 332->165 263->124 preset=medium 379->165 297->124 preset=slower (satd without early termination) This is the first code in x264 that intentionally produces different behavior on different cpus: satd_x9 is implemented only on ssse3+ and checks all intra directions, whereas the old code (on fast presets) may early terminate after checking only some of them. There is no systematic difference on slow presets, though they still occasionally disagree about tiebreaks. For ease of debugging, add an option "--cpu-independent" to disable satd_x9 and any analogous future code. commit e184ff26233198932d9b77aa7feed6a49095f136 [revision 2079] Author: Loren Merritt Date: Mon Aug 15 17:43:42 2011 +0000 Faster intra_mbcmp_x3 for versions without dedicated asm Select asm subroutines more intelligently in the wrapper functions. commit d94edd734304c16265f28dd11e8a2029cbdc5b7f [revision 2078] Author: Loren Merritt Date: Sat Aug 13 19:01:22 2011 +0000 Optimize x86 intra_predict_4x4 and 8x8 High bit depth Penryn, Sandybridge cycles: 4x4_ddl: 11->10, 9-> 8 4x4_ddr: 15->13, 12->11 4x4_hd: , 15->12 4x4_hu: , 14->13 4x4_vr: 15->14, 14->12 8x8_ddl: 32->19, 19->14 8x8_ddr: 42->19, 21->14 8x8_hd: , 15->13 8x8_hu: 21->17, 16->12 8x8_vr: 33->19, 8-bit Penryn, Sandybridge cycles: 4x4_ddr: 24->15, 4x4_hd: 24->16, 4x4_hu: 23->15, 4x4_vr: 23->16, 4x4_vl: 10-> 9, 8x8_ddl: 23->15, 8x8_hd: , 17->14 8x8_hu: , 15->14 8x8_vr: 20->16, 17->13 commit 37b2d963b262d2880271f313a17fceeee27a3de8 [revision 2077] Author: Loren Merritt Date: Sat Aug 13 06:44:28 2011 +0000 Use realistic alignment for intra pred benchmarks in checkasm commit 10ef9590e33d209a937fcb3f5ca1be66fb481a17 [revision 2076] Author: Yusuke Nakamura Date: Wed Sep 21 01:15:38 2011 +0900 Fix frame packing SEI with --frame-packing 0 According to the spec, when frame_packing_arrangement_type is equal to 0, quincunx_sampling_flag shall be equal to 1. commit cb648060484f081eba39480a26791a8e0d605989 [revision 2075] Author: Oka Motofumi Date: Mon Sep 5 11:50:37 2011 +0900 Fix install/uninstall shared libs if SYS is WINDOWS/CYGWIN commit d2452266ccf4bd9552d7ac94b5600b416d757d34 [revision 2074] Author: Reinhard Tartler Date: Wed Aug 10 00:16:46 2011 -0700 Add Hurd support to configure commit 75de7be6d5e7b0e1fc0febf087be65e91c00b80b [revision 2073] Author: Loren Merritt Date: Sat Aug 13 00:39:35 2011 +0000 Optimize x86 intra_satd_x3_* ~7% faster. commit b597966bfa8a481489e5af93eb25988456c51a5d [revision 2072] Author: Loren Merritt Date: Fri Aug 12 19:13:07 2011 +0000 Optimize x86 intra_sa8d_x3_8x8 ~40% faster. Also some other minor asm cosmetics. commit f3fc0c4485aa3ed1774bce462ad3fb92faec114b [revision 2071] Author: Loren Merritt Date: Fri Aug 12 02:15:46 2011 +0000 Scale interlaced refs/mvs for mvr predictors Slightly improves compression and fixes a Valgrind error. commit ebc334f8d1d2752b9bc2c56e457fffc123ffddee [revision 2070] Author: Loren Merritt Date: Thu Aug 11 15:03:12 2011 +0000 Optimize predict_8x8_filter and incidentally remove a valgrind false-positive commit 94493149bbc251d0ce4ceee85a9faa5ad8837a04 [revision 2069] Author: Anton Mitrofanov Date: Mon Aug 15 12:22:18 2011 +0400 Don't override flat SSE2 dequant functions with non-flat AVX ones Slightly faster. commit 25a8bb9461bf7b0c75e7fadc8d104dbdc61bed5c [revision 2068] Author: Loren Merritt Date: Mon Aug 8 13:40:53 2011 +0000 Shut up some valgrind false-positives commit ede9651875846116bdb2d3d84e3630beada7e21d [revision 2067] Author: Fiona Glaser Date: Tue Aug 16 13:02:24 2011 -0700 Avoid some unnecessary allocations with B-frames/CABAC off commit 17f16d161e386457f7eaa96866550c497af681d5 [revision 2066] Author: Fiona Glaser Date: Mon Aug 22 17:07:03 2011 -0700 Fix typo in p8x8 RD analysis Passed wrong idx to trellis. commit 5a22495a2a857b9fcd5825595422c78f0223a417 [revision 2065] Author: Anton Mitrofanov Date: Sun Aug 21 02:44:45 2011 +0400 Fix invalid memory accesses in x86 lowres_init when width <= 16 commit 8b72a9e4c87bbdfa1b87609fa9cde9bf61440383 [revision 2064] Author: Anton Mitrofanov Date: Mon Aug 15 12:03:09 2011 +0400 Fix intermediate conversion for YUVJ* pixfmts with 4:4:4 encoding commit cec1f4039fb6f4bf1c5ef97648b94e489400e5bc [revision 2063] Author: Henrik Gramner Date: Sun Aug 14 13:39:29 2011 +0200 Fix pic_out returned by x264_encoder_encode with 4:4:4 commit eaa858d33b9dcb6e526b01cc39d0268d4ae6d8c0 [revision 2062] Author: Loren Merritt Date: Thu Aug 11 22:12:26 2011 +0000 Fix zeroing of mvr predictors in bskip blocks commit 29e318fd26bd3a2e689801aeb9ff84d9e6c1d25f [revision 2061] Author: Loren Merritt Date: Thu Aug 11 01:33:13 2011 +0000 Fix: chroma planes for weightp analysis were not initted if U early-terminates and V doesn't. commit af0d8d8588e9eed4c4895747fcb7485dd0210bcf [revision 2060] Author: Henrik Gramner Date: Wed Aug 10 20:25:07 2011 +0200 Expand borders before chroma weightp analysis Prevents mc from using uninitialized source pixels. commit cfcce49df42848f601cb05086d1ef89c23675397 [revision 2059] Author: Henrik Gramner Date: Wed Aug 10 19:29:14 2011 +0200 Another 4:4:4 chroma weightp bug fix commit 51821635e8dccf877c3521830a8a5598c2bc408b [revision 2058] Author: Fiona Glaser Date: Wed Aug 10 00:17:26 2011 -0700 Fix typo in help commit 3817e54a3aeaa387206f78d5eaee98339dd7d93b [revision 2057] Author: Fiona Glaser Date: Sat Aug 6 10:45:47 2011 -0700 Improve support for varying resolution between passes Should give much better quality, but still doesn't support MB-tree yet. Also check for the same interlaced options between passes. Various minor ratecontrol cosmetics. commit 9b9a13a98b98385884b7ac25710305ad431c62e4 [revision 2056] Author: Loren Merritt Date: Sun Aug 7 22:57:27 2011 +0000 asm cosmetics: base-4 constants for shuffles commit 7e60fcd7af513e48d912dfce21026420698ed6ba [revision 2055] Author: Loren Merritt Date: Wed Aug 3 14:58:50 2011 +0000 Enable some existing asm functions that were missing function pointers pixel_ads1_avx, predict_8x8_hd_avxx High bit depth mc_copy_w8_sse2, denoise_dct_avx, prefetch_fenc/ref, and several pixel*sse4. commit 52f287e84a9965f652221f535a3298c7ce0846b9 [revision 2054] Author: Loren Merritt Date: Wed Aug 3 14:57:06 2011 +0000 Remove some unused, broken, and/or useless functions Unused frame_sort. Unused x86_64 dequant_4x4dc_mmx2, predict_8x8_vr_mmx2. Unused and broken high_depth integral_init*h_sse4, optimize_chroma_*, dequant_flat_*, sub8x8_dct_dc_*, zigzag_sub_*. Useless high_depth dequant_sse4, dequant_dc_sse4. commit 309ddabbb3fba9ba0a2ae4c23470ec539d052374 [revision 2053] Author: Loren Merritt Date: Wed Aug 3 14:56:27 2011 +0000 asm cosmetics: merge all the variants of ABS macros commit 1921c6824e37bdf5a8436a6cbe36b0d3a8c376b3 [revision 2052] Author: Loren Merritt Date: Wed Aug 3 14:53:29 2011 +0000 asm cosmetics part 2 These changes were split out of the cpuflags commit because they change the output executable. commit f85be1cdbe8d9244c0465df13ed58215a8c673cc [revision 2051] Author: Loren Merritt Date: Wed Aug 3 14:46:41 2011 +0000 asm cosmetics: INIT_MMX/XMM/YMM now support a cpuflags argument Reduces the number of macro args that need to be passed around. Allows multiple implementations of a given macro (e.g. PALIGNR) to check cpuflags at the location where the macro is defined, instead of having to select implementations by %define at toplevel. Remove INIT_AVX, as it's replaced by "INIT_XMM avx". This commit does not change the stripped executable. commit 67336688cdc0c47468cef4e6f8cf57ffd010b56e [revision 2050] Author: Loren Merritt Date: Wed Aug 3 14:43:34 2011 +0000 Import x86inc.asm patches from libav commit 189c30d390d08b2b3d3007acd0a106a4e0cd17b2 [revision 2049] Author: Loren Merritt Date: Wed Aug 3 14:42:12 2011 +0000 Cosmetics: s/mmxext/mmx2/ commit b37de18947348199bdc392b38e979f619978126e [revision 2048] Author: Henrik Gramner Date: Sun Aug 7 11:58:36 2011 +0200 Fix two bugs in 4:4:4 chroma weightp analysis Caused slightly worse compression. commit 014f9c8e3fa202f13f926ac037c3a8db494522ea [revision 2047] Author: Loren Merritt Date: Wed Aug 3 14:40:01 2011 +0000 Fix "--asm avx" Previously required "--asm sse2fast,fastshuffle,sse4.2,avx". commit 3674cf4fd338a7894883a0172ec6bde61eac6c25 [revision 2046] Author: Anton Mitrofanov Date: Fri Aug 5 15:59:20 2011 +0400 Re-add support for glibc <2.6, which doesn't have CPU_COUNT commit 1dd4b85fc700db5ec4380e20c2d73882808b3763 [revision 2045] Author: Yasuhiro Ikeda Date: Tue Aug 2 08:59:15 2011 +0900 Avoid using deprecated libavformat functions Replace av_find_stream_info with avformat_find_stream_info. Now requires libavformat 53.3.0 or newer. commit 191b68df93e7ad4096c6aa4df4120dcb0e83dded [revision 2044] Author: Henrik Gramner Date: Wed Jul 27 02:23:12 2011 +0200 Use assembly versions of some deblocking functions in MBAFF commit 459ac481e85833550470d231ae4749a138146614 [revision 2043] Author: Anton Mitrofanov Date: Thu Jul 28 00:26:27 2011 +0400 Move X264_VERSION / X264_POINTVER from config.h to x264_config.h This makes them available to external programs as part of the public API. commit 95f03f9e89c04b29aa4b5ad57fa4869899eedb4c [revision 2042] Author: Henrik Gramner Date: Fri Jul 29 20:15:52 2011 +0200 Fix padding bug in x264_expand_border_mbpair commit eee242c1a64db0c4975eaf9add82565502882293 [revision 2041] Author: Yusuke Nakamura Date: Fri Jul 29 23:39:26 2011 +0900 Timecode parsing: Add missing initialization Fix crash when failed to parse timecode file before malloc pts. Fix detection of user timebase considered to be exceeding H.264 maximum. commit e1ec7c8ae8d865165c802a69387e4d41cb004e43 [revision 2040] Author: Anton Mitrofanov Date: Thu Jul 28 13:37:24 2011 +0400 Fix crash with high bitdepth 4:2:0 input commit 10dc5bb27739fd112f5b94ffb9419fa8781c5bbe [revision 2039] Author: Daniel Kang Date: Tue Jul 26 21:57:39 2011 -0400 x86 asm cosmetics Use FDEC_STRIDEB where appropriate. commit bbfbacc9d3fa89cd922f33feb3924b67fdf31f7b [revision 2038] Author: Fiona Glaser Date: Tue Jul 26 07:40:23 2011 -0700 Fix a bug in lossless sub-8x8 RD Caused crashes in rare cases with lossless encoding. Regression in 4:4:4. commit 10474f5af22f3b2444a4301252175657b6fb1514 [revision 2037] Author: Fiona Glaser Date: Mon Jul 18 23:10:30 2011 -0700 Improved p8x4/4x8 search decision Use the same thresholding as for p16x8/8x16. Does p8x4/4x8 search more often, for a small compression improvement. commit 4a88ee1c649d92bbdbbf128e22d547e9b833f00c [revision 2036] Author: Dan Larkin Date: Wed Jul 13 12:45:23 2011 -0500 Add --subme 11, which disables all early terminations in analysis Necessary for a future trellis mode decision/motion estimation patch. Also add the slowest presets to the regression test. commit 330c8fdaccd63383ba6f7f1ccf787a5f1b89d09b [revision 2035] Author: Dan Larkin Date: Wed Jul 13 11:33:48 2011 -0500 Some trivial changes to RD thresholds The output-changing portion of the next patch. commit b5e21b60fe4422c85b9f27eda6f45d7a5517e160 [revision 2034] Author: Anton Mitrofanov Date: Wed Jul 20 22:54:43 2011 +0400 Allow setting a wider range of chroma QP offsets This allows use of the full range of chroma QP offsets, even in combination with the automatic psy-based adjustments. commit 1f285bd40b45dfa97fadc86f912a19c54563fa77 [revision 2033] Author: Fiona Glaser Date: Fri Jul 15 13:24:38 2011 -0700 Optimize macroblock_deblock_strength, add more early terminations commit 695bac1d7e66ead90952e333abeab0176ea7221d [revision 2032] Author: Fiona Glaser Date: Thu Jul 14 18:23:44 2011 -0700 Function-pointerify MBAFF deblocking functions commit 75466d2e4fff1aeba7e64a1947e8beea3f1235ff [revision 2031] Author: Fiona Glaser Date: Thu Jul 14 14:04:11 2011 -0700 Clean up MBAFF deblocking code commit 8ae69dbc7ec37e157a3890c21ec4904973e800f9 [revision 2030] Author: Fiona Glaser Date: Tue Jul 12 17:27:18 2011 -0700 Optimize frame_deblock_row commit 44269ed290f1a5457c24b6e2992bc65e92a70ac4 [revision 2029] Author: Henrik Gramner Date: Wed Jul 20 22:30:59 2011 +0200 Shrink two arrays commit aea1565f5f5d793935b10cd6081bf8dbe9513db5 [revision 2028] Author: Anton Mitrofanov Date: Mon Jul 18 15:20:05 2011 +0400 Add support for the new (4:4:4) colorspaces to x264_picture_alloc commit e93cfd6adcdd246372a38f2598590c0ab7c69b7d [revision 2027] Author: Anton Mitrofanov Date: Wed Jul 20 18:06:41 2011 +0400 Various cosmetics commit 3ef68d34b477bfd7410267eecbeaa8ebb44bccc4 [revision 2026] Author: Yasuhiro Ikeda Date: Tue Jul 12 23:41:42 2011 +0900 Improve configure help commit 9dd3e96e9420fac3cb00d44eab75450c630fe231 [revision 2025] Author: Yasuhiro Ikeda Date: Tue Jul 12 14:46:29 2011 +0900 Use $optarg for some configure options commit f7e6610ba12319d68833526676b16879aaff415c [revision 2024] Author: Rafaël Carré Date: Thu Jul 14 18:51:43 2011 -0700 Linux x264_cpu_num_processors(): use glibc macros The cpu_set_t structure is considered opaque. Also handle sched_getaffinity() error case if "cpusetsize is smaller than the size of the affinity mask used by the kernel." commit 670d81811866e9e5045d25c5def5ba2b9f06d2ac [revision 2023] Author: Anton Mitrofanov Date: Thu Jul 14 17:02:43 2011 +0400 Fix spurious "stream properties changed" with --seek option on some inputs commit aa50e72e7c723927325d031ab47b24e069dde4e3 [revision 2022] Author: Anton Mitrofanov Date: Fri Jul 15 15:06:37 2011 +0400 Fix use of deprecated libavcodec functions Replace avcodec_open with avcodec_open2. Now requires libavcodec 53.6.0 or newer. commit 67c796a37233e66239226bacd74f038281d43095 [revision 2021] Author: Kieran Kunhya Date: Wed Jul 13 20:25:40 2011 +0100 Fix nalu_process callback with HRD commit bb784df93d92fb28f67a7998faed0da425b25623 [revision 2020] Author: Anton Mitrofanov Date: Wed Jul 13 15:55:38 2011 +0400 Fix incorrect chroma swap for some input pixfmts Problem occurred if pixfmt of lavf/ffms input was PIX_FMT_RGB24 or PIX_FMT_YUV444P. commit ad1c2c8e383cb0f23ba8a0ba2ae211ad9f5eba62 [revision 2019] Author: Anton Mitrofanov Date: Tue Jun 28 21:39:09 2011 +0400 Fix resize filter crash with YUVJ* input pixfmt commit ce55ae08a6aad516e6aa2ed58fd93a2adf39a997 [revision 2018] Author: xvidfan Date: Wed Jun 22 18:46:14 2011 -0700 RGB encoding support Much less efficient than YUV444, but easy to support using the YUV444 framework. commit a93e4c4a75d05e7bf379cb9a39caad57f615eeb0 [revision 2017] Author: Fiona Glaser Date: Wed Jun 22 03:32:53 2011 -0700 4:4:4 encoding support commit 323469e393af71dedd357763883232a293c3ab02 [revision 2016] Author: Fiona Glaser Date: Mon Jun 20 16:20:21 2011 -0700 Properly weight slice header lambda in chroma weightp analysis commit ae61d0c3c236140b6a7ee4ae5f691cf8191b2282 [revision 2015] Author: Daniel Kang Date: Sun Jul 3 17:32:00 2011 -0400 Better x86 high bit depth predict_8x8c_p Avoid the need to check for corner cases by reordering arithmetic. Also make a minor optimization to high bit depth predict_16x16_p. commit a8e1be77d59ff3e5ba565b6ee133a1b2364a2dfa [revision 2014] Author: Fiona Glaser Date: Thu Jun 23 11:54:42 2011 -0700 Eliminate extra layer of indirection for sps/pps references Also remove poc type 1 support (it didn't work anyways) to reduce sps size. commit 8ade503619aff45e5be0ee544d8ab8c867eb5720 [revision 2013] Author: Fiona Glaser Date: Sat Jul 9 19:21:00 2011 -0700 Fix SSIM calculation with sliced threads commit 03bf7da697967bb8ed0b014e8623532b58051240 [revision 2012] Author: Anton Mitrofanov Date: Sat Jul 9 23:57:44 2011 +0400 Avoid possible NaNs in B-frame output stats commit defbf3f4d26d348bf07ec91588a304b59588d96e [revision 2011] Author: Rémi Denis-Courmont Date: Thu Jun 30 14:07:43 2011 -0700 ARM: do not override the toolchain default for FPU ABI commit fb629fcf1d280778f50db39f6c1038158321cc3c [revision 2010] Author: Steven Walters Date: Thu Jun 23 20:29:01 2011 -0400 Fix link errors with libswscale/libavutil as shared libraries commit e825c625999ddc0a27fc6c5cc0b39f198c22b021 [revision 2009] Author: Steven Walters Date: Sat Jun 18 14:12:34 2011 -0400 Fix deprecation in libavformat usage Replace av_open_input_file with avformat_open_input. Now requires libavformat 53.2.0 or newer. commit d89c1b43816f05e43a836d38764d74d499e82a80 [revision 2008] Author: Anton Mitrofanov Date: Thu Jun 9 01:34:14 2011 +0400 Fix various issues with VBV+threads Eliminate the race condition with interframe row predictors and threads. Recalculate frame_size_estimated at the end of a frame, for improved update_vbv_plan. Some cosmetics. commit ed3b10eb9cffcc346b5a070ce47f5a2beaf9efb6 [revision 2007] Author: Anton Mitrofanov Date: Mon Jun 6 13:54:44 2011 +0400 Fix MBAFF row VBV ratecontrol Reverts most of r1984 and implements a much simpler solution. commit d091d0e6038e770ada1a856c601c401ba729d083 [revision 2006] Author: Fiona Glaser Date: Mon May 23 17:01:02 2011 -0700 Make ratecontrol_mb less slow commit 63eb8bc9b48564f777e98dd2528c07cff09184b1 [revision 2005] Author: Steven Walters Date: Thu Jun 2 21:23:04 2011 -0400 Resize filter updates Fix use of deprecated sws_getContext. Fix uses of sws_format_name. Fix stream change warning not occurring on the first resolution change. Drop cpu detection, as it is now performed internally by swscale. Update swscale version requirements. commit d2e8686121a0418f466a0d79ef6a5367e944f940 [revision 2004] Author: Fiona Glaser Date: Tue May 17 14:50:51 2011 -0700 AVX mbtree_propagate Up to ~20-30% faster than SSE2 on Sandy Bridge. commit 6d2b51a32bbaabee1a8762adb204d035d590331b [revision 2003] Author: Fiona Glaser Date: Tue Jun 14 10:26:56 2011 -0700 Use -vsync 0 with ffmpeg regression test commit 06fbd9df654cd2b7a025c12b3a7d4b3fb3386e23 [revision 2002] Author: Henrik Gramner Date: Sat May 21 19:04:46 2011 +0200 Inline emms instructions on x86 if possible commit f7c6d308f38b3193dbb7bd9f427252e296dfcbfe [revision 2001] Author: Fiona Glaser Date: Tue Jun 14 09:35:03 2011 -0700 Make left_index_table const Should allow for some missed compiler optimizations in macroblock_cache_load. commit ca7852e211b5a270a8e4752526378a898f669017 [revision 2000] Author: Hii Date: Tue May 24 08:31:17 2011 +0800 Make --profile main/baseline force off CQMfile commit ae2a6d80432fe5fa024227742043aca976795d38 [revision 1999] Author: Anton Mitrofanov Date: Wed Jun 1 02:11:56 2011 +0400 Fix VBV bug caused by zero i_row_satd value for first and last row commit 0ce601d591e6dd029c4ae05e02f0d01bcbdcca14 [revision 1998] Author: Anton Mitrofanov Date: Tue May 31 00:13:22 2011 +0400 Fix crash with VBV + forced QP commit 6633bb9e35880c59cc23f176954f10c36db85a2b [revision 1997] Author: Anton Mitrofanov Date: Mon May 30 02:36:31 2011 +0400 Fix VBV bug with MinCR limit commit 0279e564a419353917e8fffd42e9ef737b25d740 [revision 1996] Author: Fiona Glaser Date: Fri May 20 10:43:28 2011 -0700 Fix bitstream reallocation with slice-max-size + MBAFF commit 5a37283db5c7c6b39d7ce7dc69a19480aff3c320 [revision 1995] Author: Nikoli Date: Fri Apr 29 14:19:22 2011 +0400 Improve build system capabilities Make static lib and CLI optional. Support linking CLI to system libx264. Don't strip by default, to match GNU packaging guidelines. commit 6c54a135f5d552cbed4d3067aae2621ffb4f73af [revision 1994] Author: Fiona Glaser Date: Tue Apr 26 05:12:26 2011 -0700 Slightly speed up x86 CABAC asm Also make some various cleanups. commit bee57e6df38792a01053a96cecc3ecd30a2df434 [revision 1993] Author: Fiona Glaser Date: Wed May 4 23:26:19 2011 -0700 Faster pixel_memset ~4x faster. Also inline plane_expand_border for improved constant propagation. commit 11130b0cf9192a296ba8a1521b5f80219294a6d7 [revision 1992] Author: Fiona Glaser Date: Thu May 5 00:42:43 2011 -0700 Add checkasm tests for memcpy_aligned, memzero_aligned Also make memcpy_aligned support sizes smaller than 64. commit 7ad06554bb2fbec2b543417bdaab15e0ac4bc366 [revision 1991] Author: Fiona Glaser Date: Sun May 8 18:46:52 2011 -0700 MBAFF: Add regularization to VSAD metric Bias towards the MBAFF decisions made in neighboring mb pairs. ~2% better compression on a random 1080i HDTV source. commit 1c7caa534ad1f61cd20587626e46275d2a8c7731 [revision 1990] Author: Fiona Glaser Date: Sun May 8 18:46:39 2011 -0700 MBAFF: Improve handling of bottom row mod32 padding Force skip on any MBs entirely outside the frame If an mb pair in the bottom row is chosen to be progressive, re-pad the bottom rows progressively. commit 52b3d8031b82e7672033bdc60899c1a5acf0e3b3 [revision 1989] Author: Fiona Glaser Date: Sun May 8 19:17:36 2011 -0700 MBAFF: Add frame/field MB stats commit a313dc97952e6f004475f044c181d0df3b7430af [revision 1988] Author: Simon Horlick Date: Wed Apr 27 12:49:25 2011 +0100 MBAFF: Template direct spatial commit 73dec30bc846277a64667f41b8a09e295273e896 [revision 1987] Author: Simon Horlick Date: Mon Apr 25 21:22:59 2011 +0100 MBAFF: Template cache_load and cache_load_neighbours commit 08502a7c8d6e1bad719d958a546816c55791676e [revision 1986] Author: Simon Horlick Date: Mon Apr 25 09:06:24 2011 +0100 MBAFF: Make interlaced support a compile time option commit 8029e6640967ee71b4ff94233615a5e291da62f4 [revision 1985] Author: Simon Horlick Date: Sun Apr 17 10:05:51 2011 +0100 MBAFF: Don't call zigzag_init for every mb commit 874f9b5c23dbec867b5db2a29e466487d180f9d6 [revision 1984] Author: Simon Horlick Date: Fri Mar 25 13:36:21 2011 +0000 MBAFF: Modify ratecontrol to update every two rows commit a5fa92aab92d4d3bceb1147a3055dc1d63409d9c [revision 1983] Author: Simon Horlick Date: Wed Mar 23 21:55:03 2011 +0000 MBAFF: Add support for slice-max-size Also add slice-max-size to the regression tests. commit 63b0255d6e991343d5afbe241d9be85be501584e [revision 1982] Author: Simon Horlick Date: Wed Mar 23 21:54:21 2011 +0000 MBAFF: Add support for slice-max-mbs commit 002695f017a507253c58ac9b453dc6e69d769dc6 [revision 1981] Author: Simon Horlick Date: Thu Mar 17 17:39:18 2011 +0000 MBAFF: Adaptive quantization Compute energy for interlaced and progressive choices and pick the least. commit ff33967025a683098e5008d9a0684ec068e04a85 [revision 1980] Author: Simon Horlick Date: Mon Mar 14 02:54:30 2011 +0000 MBAFF: Enable adaptive MBAFF with VSAD decision commit fea488715ec64542ce87a3312644bcfda994d6d9 [revision 1979] Author: Simon Horlick Date: Sat Apr 23 10:44:04 2011 +0100 MBAFF: Create a VSAD DSP function x86 assembly by Fiona Glaser. This gives roughly 30x speed increase over the C version. commit 3a7194f124fbf99cf3a3ca7aef45790196546f88 [revision 1978] Author: Simon Horlick Date: Tue Mar 15 01:17:01 2011 +0000 MBAFF: Direct spatial commit 27c5c5d090ca5f64ae2998e1de70480c46bce87b [revision 1977] Author: Simon Horlick Date: Tue Mar 15 01:16:20 2011 +0000 MBAFF: Direct temporal commit 638092d32aafe22201bae8a63fa5e7a005e4a7de [revision 1976] Author: Simon Horlick Date: Tue Mar 15 01:15:06 2011 +0000 MBAFF: Calculate bipred POCs Need to calculate two tables for the cases where the current macroblock is progressive or interlaced as refs are calculated differently for each. commit d5c57af4055d4177ca1c67fe7e3ac36e07dca179 [revision 1975] Author: Simon Horlick Date: Tue Mar 15 01:39:49 2011 +0000 MBAFF: Use both left macroblocks for ref_idx calculation commit 5eccc7d2263ffd63921b95b9e95152ffb6b3645c [revision 1974] Author: Simon Horlick Date: Sun Apr 3 15:23:35 2011 +0100 MBAFF: First edge deblocking commit a71cd871325158577a3be8ed96e8abfe22645042 [revision 1973] Author: Simon Horlick Date: Mon Mar 21 11:03:23 2011 +0000 MBAFF: Implement left edge deblocking functions commit 94b9141609d17ebbeb3184a8a5fc0660725a4cf2 [revision 1972] Author: Simon Horlick Date: Sat Apr 2 18:27:13 2011 +0100 MBAFF: Add extra data to the deblock strength structure commit 8c2114db1938e04b4122fcb96b6380329bf1cf31 [revision 1971] Author: Simon Horlick Date: Wed Mar 16 21:27:07 2011 +0000 MBAFF: Deblocking support commit efcfead77b71e65eaf3412b02c525ba6b1f59c90 [revision 1970] Author: Simon Horlick Date: Mon Mar 21 11:02:27 2011 +0000 MBAFF: Move common code from deblock functions commit e30bb5318d32ae107560c7242b2f361abed0c6a6 [revision 1969] Author: Simon Horlick Date: Wed Mar 16 21:18:59 2011 +0000 MBAFF: Add mbaff deblock strength calculation Move call to deblock_strength to x264_macroblock_deblock_strength to keep deblock strength calculation in one place. commit be87b09f435052a22f7ac15a469d75964d572b2f [revision 1968] Author: Simon Horlick Date: Thu Apr 21 01:47:53 2011 +0100 MBAFF: Update x264_cabac_mvd_sum_mmxext to work with larger MVDs. Author: Loren Merritt commit 7e081bd859a7ca89e702e80a9f8997064d93b196 [revision 1967] Author: Simon Horlick Date: Tue Mar 29 15:47:04 2011 +0100 MBAFF: Clamp MVDs to 66 instead of 33 commit 02d7ef5f1696396892a60d6ce6f6cec2408df92b [revision 1966] Author: Simon Horlick Date: Tue Mar 29 15:46:34 2011 +0100 MBAFF: CABAC encoding of skips commit 532bb60a6695ba1ceb698747e57107de422491f2 [revision 1965] Author: Simon Horlick Date: Sun Feb 20 15:31:55 2011 +0000 MBAFF: Track what interlace decision the decoder is using commit 313d3770baa69915f7f5e7ba87c7aa79110b1ab4 [revision 1964] Author: Simon Horlick Date: Sun Feb 6 22:58:39 2011 +0000 MBAFF: Fix mvy bounds Fix MV clipping commit 134ed96db6397bb7b8f1a5ff792ef1a0908d8cf1 [revision 1963] Author: Simon Horlick Date: Wed Mar 16 21:34:51 2011 +0000 MBAFF: Copy deblocked pixels to other plane commit 7055d13f2f55ad7e78c1c12aad3a7024ac0be7f1 [revision 1962] Author: Simon Horlick Date: Tue Mar 29 20:26:33 2011 +0100 MBAFF: Disallow skip where predicted interlace flag would be wrong commit a1974d1cb8ad9e1c6e3a7eedf09eac7c5ce6b162 [revision 1961] Author: Simon Horlick Date: Tue Mar 29 20:25:23 2011 +0100 MBAFF: Inter support commit 2e5fc7235679bc968e3d4ac73ad1f39fa68b9987 [revision 1960] Author: Simon Horlick Date: Fri Jan 14 21:18:14 2011 +0000 MBAFF: Neighbour calculation Back up intra borders correctly and make neighbour calculation several times longer. commit 689a8258fd9a805cc46ae16fc2a0a22c31f8b76f [revision 1959] Author: Simon Horlick Date: Tue Jan 11 20:21:26 2011 +0000 MBAFF: Store references to the two left macroblocks commit a13ba181b7e1bcc42d6c8855de2c20ce1b652591 [revision 1958] Author: Simon Horlick Date: Tue Jan 11 20:16:18 2011 +0000 MBAFF: Store left references in a table commit 740f203f8248450430eca5aeac74bcb0f3269c64 [revision 1957] Author: Simon Horlick Date: Tue Jan 11 20:09:00 2011 +0000 MBAFF: Disable adaptive MBAFF when subme 0 is used commit a11114e6da54932f59afb02ea13ae41aaf4f3f98 [revision 1956] Author: Simon Horlick Date: Tue Jan 11 20:05:54 2011 +0000 MBAFF: Save interlace decision for all macroblocks commit e64d7a9fdb92005399de1146177f155760897049 [revision 1955] Author: Fiona Glaser Date: Thu May 12 10:21:16 2011 +0800 Fix bug in NAL buffer resizing Also properly terminate if NAL buffer resizing fails. commit c4c995a8838f673ded01cf85ab023c2b7578106d [revision 1954] Author: Anton Mitrofanov Date: Thu May 5 16:27:49 2011 +0400 Fix zone bitrate multiplier and QP forcing in 2-pass mode Previously zone changes could affect frames outside of the given frame range (around 20 neighboring frames). commit 4a9ead8bfa95d7c93ca022242ac7ae4912deb776 [revision 1953] Author: Fiona Glaser Date: Thu May 5 03:24:38 2011 -0700 Use float constants in qp rounding Slight performance improvement and fixes slight difference in output between gcc 3.4 and 4.5. commit 97400fceb9388dda330ca05221b01989097e9496 [revision 1952] Author: Anton Mitrofanov Date: Wed May 4 11:49:06 2011 +0400 Fix bugs with ratecontrol reconfiguration Initialization of some parameters was missed or wasn't synchronized with other threads commit 41c56f5eebaade1c46a9124195c751d4d3d24daa [revision 1951] Author: Anton Mitrofanov Date: Wed May 4 11:45:58 2011 +0400 More validation of input parameters This fixes a crash with --me umh and insane values of --me-range. commit 91965e48fa97c194ab0c661b3d6d41e949426097 [revision 1950] Author: Anton Mitrofanov Date: Sun May 1 17:28:56 2011 +0400 Fix bug in --b-adapt 2 with --rc-lookahead >248 Problem caused by buffer overflow in strcpy. commit 788c2881c09795dbe2c00c8e73b0bfb4664c90d5 [revision 1949] Author: Oka Motofumi Date: Thu Apr 28 13:13:49 2011 +0900 Check for invalid pixfmts in lavf demuxer commit 80a661e2364ede32e5797eeb5e7bfec452016082 [revision 1948] Author: Fiona Glaser Date: Tue May 10 01:58:21 2011 -0700 Fix regression in r1944 Broke sliced-threads + slice-max-size/slice-max-mbs. commit 29a58f4a1d148667eb0bd8eca07189f5d30d1142 [revision 1947] Author: Fiona Glaser Date: Sun Apr 24 18:36:26 2011 -0700 Precalculate CABAC initialization contexts Slightly faster encoding with lots of slices. commit 866bf26018c8d76c475d23e7fe028774e8ec9814 [revision 1946] Author: Fiona Glaser Date: Sat Apr 23 21:22:13 2011 -0700 Avoid redundant log2f calls in mv cost initialization Saves around 100 million clock cycles on x264 init. commit 040d45415d25547033f99ae059dbcf055583d8d2 [revision 1945] Author: Fiona Glaser Date: Thu Apr 21 14:19:05 2011 -0700 CABAC residual: cleanup and optimizations Also kill all Hungarian notation while we're at it. Trim an instruction off cabac_encode_bypass. commit 773d969b848af3440735e05cd06c14026232a0cf [revision 1944] Author: Anton Mitrofanov Date: Wed Apr 20 02:54:49 2011 +0400 Validate input parameters more carefully Get rid of redundant warnings upon encoder_reconfig calls. Also avoid encoder_reconfig turning off psy_rd/trellis. commit 7b3d7364e006f9e240c44ba9c5a43094c68e0892 [revision 1943] Author: Anton Mitrofanov Date: Fri Apr 22 01:13:28 2011 +0400 Fix VFR MB-tree to work as intended Should improve quality with FPSs much larger or smaller than 25. commit 303449825e3424d8661fd43dac170e5d85a09d4c [revision 1942] Author: Yasuhiro Ikeda Date: Sun Apr 24 15:33:45 2011 +0900 Support more recent GPAC versions commit aa4e80ddbbca8a9ab217dfc8686efea5ebbf4315 [revision 1941] Author: Anton Mitrofanov Date: Sat Apr 23 15:19:40 2011 +0400 Fix decoder desync with positive --chroma-qp-offset and zones commit e9389034a5e4812a185f4b66654925d8adf4c437 [revision 1940] Author: Anton Khirnov Date: Wed Apr 20 10:53:44 2011 +0200 Use AVMEDIA_TYPE_VIDEO instead of deprecated CODEC_TYPE_VIDEO Fixes build with lavf/lavc 53. commit 039675b4d28731181bd49a0a076fb72148d8e962 [revision 1939] Author: Fiona Glaser Date: Fri Apr 15 16:33:27 2011 -0700 Force pic-struct for Blu-ray compat + fake-interlaced commit 24db70d508d5afdc4e9f5ba017aa875d80fc1487 [revision 1938] Author: Anton Mitrofanov Date: Thu Apr 14 12:14:52 2011 -0700 Fix open-gop with no-psy commit 0a937473a99701d986bb285056438355b0998a96 [revision 1937] Author: Steven Walters Date: Thu Apr 14 11:09:45 2011 -0700 Fix build with disabled asm commit 42693c88c7c10f6156a4bb3a980a04eb23f02276 [revision 1936] Author: Fiona Glaser Date: Wed Apr 6 02:16:42 2011 -0700 Improve Blu-ray compliance Use dec_ref_pic_marking SEIs to repeat B-ref referencing information. Don't allow B-frames to reference frames outside their minigop. commit e54ea0c803b63af5c473a6218ee466d5b34e5d5c [revision 1935] Author: Fiona Glaser Date: Wed Apr 6 17:15:50 2011 -0700 Consolidate Blu-ray hacks into --bluray-compat This option is now required for Blu-ray compatibility. --open-gop bluray is now gone (using bluray-compat and open-gop implies a Blu-ray compatible open-gop). This option doesn't automatically enforce every aspect of Blu-ray compatibility (e.g. resolution, framerate, level, etc). commit bdb88277a5080cd120df699373431cee95d57bc8 [revision 1934] Author: Fiona Glaser Date: Tue Mar 29 05:33:44 2011 -0700 Add SSE support to rectangle.h for 16-byte stores Uses GCC vector intrinsics; may be suboptimal on particularly old GCC versions. commit c52f879268118212ac12d8edd7943210726855fb [revision 1933] Author: Steven Walters Date: Tue Apr 12 19:22:56 2011 -0400 Do not force Intel Compiler to target pre-mmx architecture for x86 Caused a speed penalty against gcc equivalents. commit 97797d2dd4b41e22af651accd41c29e2a469decb [revision 1932] Author: Fiona Glaser Date: Tue Apr 12 01:16:48 2011 -0700 Warn users when using --(psnr|ssim) without --tune (psnr|ssim) This is a counter to the proliferation of incredibly stupid psnr/ssim "benchmarks" of x264 in which the benchmarker conveniently "forgot" --tune psnr/ssim, crippling x264 in the test. commit 59ce517a0213bd8505bb4e6315e2970df04dae6e [revision 1931] Author: Dylan Yudaken Date: Thu Apr 7 16:06:19 2011 -0700 Remove redundant mbcmp calls in weightp analysis commit 7b4a60338d8e1465d1f617eaa326289c16b427e8 [revision 1930] Author: Anton Mitrofanov Date: Wed Apr 6 22:48:57 2011 +0400 Use integer math for filler size calculation commit 0c3054f0dfc84b99b8305ebbeb647533a741994d [revision 1929] Author: Anton Mitrofanov Date: Tue Apr 5 14:06:54 2011 +0400 Disable progress for FFMS input with --no-progress commit 9f38a5bf62ee6bd16444243066dd4f01aceace16 [revision 1928] Author: Michael Stuurman Date: Thu Mar 31 13:45:22 2011 -0700 Fix bug in intra-refresh ratecontrol Row SATDs were slightly incorrect. commit d6daf2b914a658ecc57346a7348f5f8400b003d2 [revision 1927] Author: Fiona Glaser Date: Sun Apr 10 04:39:51 2011 -0700 Cosmetics: fix some signedness issues found by -Wsign-compare commit 2246e451e0545a534144f04ef5f2b5d23c2f1a38 [revision 1926] Author: Fiona Glaser Date: Sun Apr 3 16:31:52 2011 -0700 Minor fixes Fix a comment typo. Align an array properly. Make x264_scan8 unsigned: saves a bunch of movsxd instructions on x86_64. commit 1d9b1bc7bc75c35aee7c8f6e0a2ef80bfefc57ec [revision 1925] Author: Anton Mitrofanov Date: Fri Mar 25 00:08:40 2011 +0300 Improve C99 support checks in configure Fixes configuration with Intel compiler in some cases. commit 6cbc47d476f610218c7e973d2c806b24bb4dd1b9 [revision 1924] Author: Fiona Glaser Date: Fri Mar 18 18:24:33 2011 -0700 Eliminate the possibility of CAVLC level code overflow Instead, if it happens, just re-encode the MB at higher QPs until it fits. commit 34e3a69755995a23c1f10f34321521af4182e559 [revision 1923] Author: Henrik Gramner Date: Sat Mar 12 23:21:09 2011 -0800 x86 SIMD versions of optimize_chroma_dc SSE2/SSSE3/SSE4/AVX implementations. About 3x faster. commit 49a32b91eda5afc05e8a4a22577f6182987205c6 [revision 1922] Author: Manuel Rommel Date: Sun Dec 26 18:52:49 2010 +0100 Add Altivec version of mc_weight commit bd38b231d12f4deb9d0d43b1f5f22c157e1b115c [revision 1921] Author: Manuel Rommel Date: Sun Dec 26 21:41:33 2010 +0100 Add Altivec versions of mbcmp_x functions These aren't merged versions, they just call the existing asm code. A merged implementation would of course be faster. commit 591d45ee98b29e92d14f1fff06f50c24d9f9262a [revision 1920] Author: Steven Walters Date: Wed Mar 2 21:31:27 2011 -0500 Recognize cygwin as itself when not targeting mingw Also fix broken thread detection on cygwin. commit e19e206cbf0547ebf0394d9542c429c55bd5409a [revision 1919] Author: Steven Walters Date: Wed Mar 2 20:39:25 2011 -0500 Patch Intel's CPU dispatcher Reduces Intel Compiler's bias against non-Intel CPUs. Big thanks to Agner for the original information on how to do this. commit 4c624dccf4d1e13653be90c26dac49664c0f8241 [revision 1918] Author: Steven Walters Date: Mon Feb 28 19:07:40 2011 -0500 Intel Compiler support Big thanks to David Rudie, the original author of this patch. commit 2e66a20694b51a36246d04008aa526cba48d625c [revision 1917] Author: Kieran Kunhya Date: Tue Mar 8 09:41:46 2011 +0000 Cosmetics: make struct definition braces consistent commit f2b079718b8658bf453f0276d28503984a6dcff1 [revision 1916] Author: Fiona Glaser Date: Wed Mar 23 20:59:41 2011 -0700 Fix restoring of console title on Windows with ffms indexing commit 86f8a74a117fc697c58822c0dd7d9d841959151c [revision 1915] Author: Anton Mitrofanov Date: Thu Mar 10 00:31:26 2011 +0300 Fix possible buffer overflow in mp4 muxer commit d78a296270c3d8bbff6d81176eb510c1a75d23c9 [revision 1914] Author: Nick Lewycky Date: Mon Mar 7 18:10:36 2011 -0800 Remove inline asm syntax not supported by LLVM's assembler Doesn't affect compiled output outside of LLVM. commit 937cae7115f4ef42ecc285c639b533456226b0c1 [revision 1913] Author: Fiona Glaser Date: Fri Feb 18 17:50:42 2011 -0800 Fix 10L in r1912 SSSE3 code got used in MMX/SSE2 and vice versa (in hpel). commit abc2283e9abc6254744bf6dd148ac25433cdf80e [revision 1912] Author: Daniel Kang Date: Sat Jan 15 13:44:45 2011 -0500 Add AVX functions where 3+ arg commands are useful commit 7f918d15fbd8d6c65ae1548c058765ebc4b83203 [revision 1911] Author: Fiona Glaser Date: Mon Feb 7 03:15:03 2011 -0800 Frame-packing 3D: don't place scenecuts on right views Caused problems for some players. commit 3202f34117d0850eec9ec937cbb5fa72f89b849b [revision 1910] Author: Fiona Glaser Date: Fri Feb 11 00:54:51 2011 -0800 Improve slice-max-size handling of escape bytes More accurate but a bit slower. Helps deal with a few obnoxious corner cases where the current algorithm failed. commit afd969a67bc6f69dccd71e5e7e68755c49212cac [revision 1909] Author: Nathan Caldwell Date: Thu Feb 10 21:25:00 2011 -0700 Use bs_write1 wherever possible in header writing commit c2659d26be6c20727aa78b699b1d282b3a3f2718 [revision 1908] Author: Fiona Glaser Date: Tue Feb 8 14:48:18 2011 -0800 Remove obsolete mvcost init code commit 03a8f4c8e32bf03096344244271ade318e252ce1 [revision 1907] Author: Anton Mitrofanov Date: Sun Feb 13 12:19:13 2011 -0800 Fix memory leak on encoder close if not all frames are flushed commit 228f57c2121b8473001bb58b13a075cedca033e7 [revision 1906] Author: Anton Mitrofanov Date: Sat Feb 12 05:19:55 2011 -0800 Fix signedness bug in CPU detection Luckily didn't affect anything due to C signedness rules. commit 33a44b55b815f135bdb46d77e660eaef56dc42b6 [revision 1905] Author: Fiona Glaser Date: Fri Feb 11 13:47:27 2011 -0800 Fix dumb bug caused by stray semicolon Caused noise reduction to run incorrectly in part of RD, but probably had no effect. commit 4a3e072ecd0e335c444dc80e49db0e6eaf59cef2 [revision 1904] Author: Anton Mitrofanov Date: Thu Feb 10 05:05:53 2011 -0800 Fix malloc of zero size Caused x264 to fail with some settings on systems that return a NULL pointer for malloc(0), like Solaris. commit f0b8dd33c0aa2a3487ca567d1f5207c90b1e314c [revision 1903] Author: Anton Mitrofanov Date: Wed Feb 9 23:01:07 2011 +0300 Fix crash in mp4 muxer after failure of x264_encoder_open commit 24caade52252a0a41c4869525dde8e5a47c55063 [revision 1902] Author: Fiona Glaser Date: Wed Feb 9 11:36:02 2011 -0800 Fix shadowed variable warning in ffms.c commit eaf5ce20cfd35c9fbb37e64e066ec61bd4bc5fcf [revision 1901] Author: Anton Mitrofanov Date: Wed Feb 9 11:29:23 2011 -0800 Fix some Intel compiler warnings commit d147eea3cde028059f8c3ed65c49ad6692ecd629 [revision 1900] Author: Fiona Glaser Date: Sun Feb 6 23:12:09 2011 -0800 Fix 10L in r1886 Aspect ratio can't be set before SPS is initted. commit f1ae384f3f1987a389b1226150700bc83824c10e [revision 1899] Author: Fiona Glaser Date: Fri Feb 4 20:49:45 2011 -0800 Improve update interval of x264cli progress information Now updates every 0.25s instead of every N frames. commit 3fa4f5d25ae66ce7fd151c729ceceae13ec364b7 [revision 1898] Author: Fiona Glaser Date: Sat Feb 5 01:16:49 2011 -0800 Windows: restore previous console title after encoding MSDN docs claim that SetConsoleTitle's effect is reverted when the process terminates, but this doesn't always work properly. Accordingly, manually revert the console title at the end of encoding. commit 7e288fcf3e5bd19afcfa8790976c75c7f6682731 [revision 1897] Author: Fiona Glaser Date: Sat Feb 5 15:02:34 2011 -0800 Allow WEIGHTP_FAKE in interlaced mode It seems to work fine as-is even though real weightp doesn't support interlacing yet. commit c1212c02dfb59118ac4363f61bbf3464042c250e [revision 1896] Author: Kieran Kunhya Date: Wed Feb 2 11:01:13 2011 +0000 Output pic struct information in libx264 API commit 3240ec6c5284214c7af9f02dffd285014b3dae5d [revision 1895] Author: Fiona Glaser Date: Sun Jan 30 00:00:09 2011 -0500 Enable FastShuffle on Penryn and Nehalem CPUs without SSE4 commit 12a37e22cf8c236433ccc8f105a85cd631fff685 [revision 1894] Author: Yasuhiro Ikeda Date: Thu Feb 3 10:54:44 2011 +0900 Minor cosmetics in configure commit 6e57cced1034afa104358f6f12a70197181ad006 [revision 1893] Author: Anton Mitrofanov Date: Fri Jan 28 18:44:24 2011 -0800 Various --help cosmetics commit c00e15b76b35beb95e66d39cf67828c36191d6e1 [revision 1892] Author: Loren Merritt Date: Sun Jan 30 02:27:32 2011 -0800 x86inc.asm: error on duplicate functions Compile error if there's two functions of the same name, instead of silently renaming one of them. commit ff3c1be48a673c479ad5d51ef1e97f59a369a035 [revision 1891] Author: Fiona Glaser Date: Mon Jan 31 13:56:23 2011 -0800 Bump yasm version requirement to handle AVX commit b7c745c4a747629daba4dc6f765d32293cb4f3d6 [revision 1890] Author: Fiona Glaser Date: Fri Feb 4 20:48:37 2011 -0800 Fix rare corruption with subme=10 Also fix dumb bug in VBV emergency AQ scaling. Regression in r1881. commit f2ced3ff5f42784efe1b1d37738a645aad3fd52a [revision 1889] Author: Mans Rullgard Date: Thu Feb 3 13:32:06 2011 -0800 Fix overflow in ARM NEON i16x16 planar pred Patch backported from ffmpeg. commit 716cf882d7b31e6ffd9b00658a53227900e56cad [revision 1888] Author: Anton Mitrofanov Date: Wed Feb 2 22:51:45 2011 +0300 Fix incorrect frame duration for VFR input for some frames commit d7c05794b6645de764ca9f9a0b71b15f9761eeda [revision 1887] Author: Anton Mitrofanov Date: Tue Feb 1 00:43:03 2011 +0300 Fix possible division by zero in mkv and flv muxers on close This could crash if anything failed before output.set_param (for example, incorrect params refused by x264_encoder_open). Bug introduced in r1873. commit 8c881320850542a496be53a107d1a13290a03785 [revision 1886] Author: Fiona Glaser Date: Fri Jan 28 15:19:06 2011 -0800 Fix reconfiguration of parameters that modify the SPS For now, this is only aspect ratio. commit 49638791347bf895cfa6ce1d3985947fb905659e [revision 1885] Author: Fiona Glaser Date: Fri Jan 28 14:03:08 2011 -0800 Fix possible crash on Phenom with lookahead thread disabled Misalign mask needs to be set for the main thread on entry, too. commit 2ae5b902b8a1a3275d31586841d12dc3191b1389 [revision 1884] Author: Anton Mitrofanov Date: Sat Jan 29 12:43:34 2011 -0800 Hotfix for some bugs in VBV emergency commit 2f676f6f7966a536536e4d33829a8030a0694259 [revision 1883] Author: Fiona Glaser Date: Thu Jan 27 13:59:20 2011 -0800 Fix warnings in cpu.c commit ce7ee9d2eeed6a81ab9b2a7d8d9f4bc5674c18c7 [revision 1882] Author: Fiona Glaser Date: Thu Jan 27 05:33:25 2011 -0800 Check for OS AVX support in addition to CPUID Even if not using ymm registers, AVX operations will cause SIGILLs on unsupported OSs. On Windows, AVX is only available on Windows 7 SP1 or later. commit e6025413ea3e4d9ee0e4b1e1b4d38a9eeb949d49 [revision 1881] Author: Fiona Glaser Date: Tue Jan 18 00:52:03 2011 -0800 VBV emergency mode Allow ratecontrol to select "quantizers" above the maximum. These "quantizers" progressively decimate the source to avoid VBV underflow. x264 is now VBV compliant even with input as evil as /dev/random. commit 68cda11b73471d090776cdbe5dbff7f8563fadb5 [revision 1880] Author: Fiona Glaser Date: Wed Jan 12 09:54:33 2011 -0800 Initial AVX support Automatically handle 3-operand instructions and abstraction between SSE and AVX. Implement one function with this (denoise_dct) as an initial test. x264 can't make much use of the 256-bit support of AVX (as it's float-only), but 3-operand could give some small benefits. commit 8fb87147d3152fb37724d7c2996ade9263ddd90e [revision 1879] Author: Fiona Glaser Date: Tue Jan 11 11:04:52 2011 -0800 Double the base framerate for frame-sequential 3D files A 60fps frame-sequential 3D file is really only 30 FPS, just alternating between eyes. Accordingly, ratecontrol should treat it as if it was really 30 FPS. This will increase the bitrate at the same CRF level for such videos when --frame-packing 5 is used. commit b2bf3f99c060fdbd930e9ed5500a05da1344c229 [revision 1878] Author: Yasuhiro Ikeda Date: Thu Jan 20 23:12:01 2011 +0900 Add --input-fmt option to lavf input Conforms to ffmpeg's `-f` option. Use this when lavf fails to guess the input format. commit 240c82e70f68c430f459b067a811de8918ca7d79 [revision 1877] Author: Tony Young Date: Fri Jan 21 13:06:28 2011 -0800 Two improvements to regression test script Use SHA-1 hashes for temporary file names to avoid exceeding OS filename length limits. Correctly return to the original branch after testing if you were on a branch. commit 123b298d98ea67f80a039d9a0b3b2519247e4922 [revision 1876] Author: Vittorio Giovara Date: Fri Jan 14 10:02:33 2011 -0800 Add some missing values to the non-extended SAR table commit ee9bc136e9e6f0875308c9505a08360294e7cd4a [revision 1875] Author: Sean McGovern Date: Fri Jan 14 02:10:12 2011 -0500 Bump dates to 2011 commit e0b101821f3a900fa2958194cb316a3440455d60 [revision 1874] Author: Fiona Glaser Date: Tue Jan 18 12:31:26 2011 -0800 More correctly write frame-packing SEI flags Bug reported by Nero. commit 6d995330cc86f4d914ee718492121618bb0f37b6 [revision 1873] Author: Fiona Glaser Date: Thu Jan 20 14:45:57 2011 -0800 Don't die in x264_encoder_close if an error occurred in x264_encoder_encode Also clean up properly in x264.c (mostly useful for finding bugs in cleanup). commit 5696ec3b8c69b57bd7bf0692c31578439fce3b5d [revision 1872] Author: Fiona Glaser Date: Sun Jan 23 21:03:14 2011 -0800 Fix reconfiguration of b_tff Attempting to change field order during encoding could cause slight corruption. Also fix delta_poc_bottom to be correctly set if interlaced mode is used without B-frames. commit d4fbb266a0077e1c90e4f3baf19610db1565ecba [revision 1871] Author: Steven Walters Date: Sun Jan 23 15:19:11 2011 -0500 Fix x264 CPU detection with >=64 CPUs on Windows x264 won't actually use more than one processor group's worth of CPUs, however. This isn't a problem, as a single x264 instance can't effectively use a full 64 cores anyways. commit 31467f3270e791d1fd4728abd6f84c35819f757e [revision 1870] Author: Holger Lubitz Date: Fri Jan 21 19:13:57 2011 +0100 Remove high bit depth mmx quant It was using pmuludq which is sse2, and the function isn't really possible without pmuludq. commit fb223b970976fab3edab500b112f670d6ec8dd2d [revision 1869] Author: Holger Lubitz Date: Sat Jan 22 16:49:23 2011 +0100 Fix cacheline check in avg2 w20 cache32 Didn't result in incorrect output, only slightly decreased speed on a few obsolete systems. commit bdf0ac7468b5543a259fd5dcf6f17474ead4fb05 [revision 1868] Author: Holger Lubitz Date: Fri Jan 21 17:17:29 2011 +0100 Fix illegal instruction in high bit depth ssd_nv12_mmxext Unfortunately paddq isn't available in mmxext, only in sse2 and up. Also fixes to actually allow widths up to 16416/32832 without overflow. commit c583687fab832ba7eaf8626048f05ad1f861a855 [revision 1867] Author: Fiona Glaser Date: Thu Dec 23 19:33:01 2010 -0500 VFR/framerate-aware ratecontrol, part 2 MB-tree and qcomp complexity estimation now consider the duration of a frame in their calculations. This is very important for visual optimizations, as frames that last longer are inherently more important quality-wise. Improves VFR-aware PSNR as much as 1-2db on extreme test cases, ~0.5db on more ordinary VFR clips (e.g. deduped anime episodes). WARNING: This change redefines x264's internal quality measurement. x264 will now scale its quality based on the framerate of the video due to the aforementioned frame duration logic. That is, --crf X will give lower quality per frame for a 60fps video than for a 30fps one. This will make --crf closer to constant perceptual quality than previously. The "center" for this change is 25fps: that is, videos lower than 25fps will go up in quality at the same CRF and videos above will go down. This choice is completely arbitrary. Note that to take full advantage of this, x264 must encode your video at the correct framerate, with the correct timestamps. commit 247f504d3c7ac64a87ed5a12bab0f6b99af5959c [revision 1866] Author: Fiona Glaser Date: Fri Dec 31 22:54:16 2010 -0500 Improve reference ordering in interleaved 3D video Provides a decent compression improvement when encoding interleaved 3D content (--frame-packing 5). Helps more without B-frames and at lower bitrates. Note that x264 will not do this optimization unless --frame-packing 5 is used to tell x264 that the source is interleaved 3D. Tests consistently show that interleaved frame packing is by far the best way to compress 3D content. It gives a ~35-50% compression benefit over separate streams or top/bottom or left/right coding. Also finally add support for L1 reference reordering (in B-frames). Also add support for reordered ref0 in L0 and L1 lists; could be useful in the future for other things. commit c081c854524099e65a2273bdbd67ac344b01ae03 [revision 1865] Author: Fiona Glaser Date: Tue Dec 21 20:58:10 2010 -0500 Cosmetics: fref0/1 -> fref[2] and i_ref0/1 -> i_ref[2] A much-needed refactoring, plus makes the next patch easier. commit e373f64359ec14cf4744a3d5a50eb3e00289805d [revision 1864] Author: Alex Wright Date: Sat Dec 25 19:31:00 2010 +1000 Check an extra offset during weightp analysis Up to 0.1 - 0.6 dB gain on some fade-ins with --weightp 1, less with --weightp 2. commit 8e3212863cd22b2c6f71acd61d575b7b25a7f1c1 [revision 1863] Author: Daniel Kang Date: Tue Jan 4 15:27:38 2011 -0500 SSE2 high bit depth SSIM functions Patch from Google Code-In. commit 770718bc498bbc215c3f0876013de2b2b3c1db32 [revision 1862] Author: George Stephanos Date: Sun Jan 2 11:26:10 2011 -0500 SSE2 high bit depth intra_predict_(8x8c|16x16)_p Patch from Google Code-In. commit bc8948fc0aa57bb9099dcd647fe5775322580e0a [revision 1861] Author: Joe Cortes Date: Fri Dec 24 21:33:57 2010 -0600 MMX high bit depth coeff_last4 Patch from Google Code-In. commit af617efc12d39396e758adaa2b7b0447aed683c3 [revision 1860] Author: Daniel Kang Date: Thu Dec 23 12:15:03 2010 -0500 SSE2 high bit depth zigzag_interleave_cavlc Patch from Google Code-In. commit 648147bbc16722e67173c588c662098267294d93 [revision 1859] Author: Daniel Kang Date: Wed Dec 22 17:53:08 2010 -0500 MMX/SSE2/SSSE3 high bit depth frame_init_lowres functions Patch from Google Code-In. commit 6b04221c78325a91e2b9b7a3e6deba86d4d23ed6 [revision 1858] Author: Daniel Kang Date: Thu Dec 23 23:19:39 2010 -0500 MMX high bit depth 4x4 intra predict functions DDR and HD directions, as well as making HU faster. Also enable some SSE2 versions of high bit depth functions that were added but not properly enabled. Patch from Google Code-In. commit f0f76f991280c3b90450095e7880b3791fa6a746 [revision 1857] Author: Daniel Kang Date: Wed Dec 22 16:51:22 2010 -0500 SSE2 high bit depth 8x8 intra predict functions DDL, DDR, VR, HU, and HD directions, as well as the 8x8 filter. Also make 8-bit MMX VR faster, by backporting the optimizations from the high bit depth version. Patch from Google Code-In. commit df5d19b45cd364b8015f09cf2eeb2c3cd7739039 [revision 1856] Author: George Stephanos Date: Wed Dec 22 15:44:03 2010 -0500 MMX/SSE2 high bit depth 8x8c intra predict functions Patch from Google Code-In. commit 1d22dd50b5792746ff28b2b4815c17c82bec5af3 [revision 1855] Author: Daniel Kang Date: Sun Dec 19 16:31:59 2010 -0500 MMX version of high bit depth plane_copy And various cosmetics. Patch from Google Code-In commit 341b61474a9bb29d9a1c1a007b7d0d1b0a10e117 [revision 1854] Author: Fiona Glaser Date: Sat Dec 18 12:40:13 2010 -0800 Faster x86 predict_8x8c_dc, MMX/SSE2 high bit depth versions commit a36face6a7d9669be6a6e40626d530ef9ff31f30 [revision 1853] Author: Fiona Glaser Date: Sat Dec 18 05:40:49 2010 -0800 SSSE3 high bit depth sad_aligned functions commit 6ecfa83c34b665ca9e98814babf4bd3e09ac6706 [revision 1852] Author: Daniel Kang Date: Thu Dec 16 04:41:17 2010 -0800 MMX/SSE2 high bit depth interleave functions Patch from Google Code-In. commit 15595e6d94940064046c61e64ef9cea993f3e05c [revision 1851] Author: Joey Geralnik Date: Wed Dec 15 09:14:56 2010 +0200 MMX/SSE2 high bit depth avg functions Patch from Google Code-In. commit c3937a516e9d865315662435fc03c42c31276b7e [revision 1850] Author: Daniel Kang Date: Tue Dec 14 22:47:51 2010 -0500 MMX/SSE2 high bit depth deinterleave functions Patch from Google Code-In commit 8bed3a1418edf4b146d84445e692b17cf854bbe5 [revision 1849] Author: Fiona Glaser Date: Wed Jan 5 23:29:36 2011 -0500 Shut up some incorrect gcc uninitialized variable warnings commit 116ff56c7f2b6b47fb73ae9fc30590caf038dc09 [revision 1848] Author: Anton Mitrofanov Date: Sat Dec 25 00:55:14 2010 +0300 Write --crop-rect and --frame-packing options to x264 SEI commit 3f700a324bc445bce02433ce2a1444501f4c929a [revision 1847] Author: Fiona Glaser Date: Wed Dec 15 13:00:14 2010 -0800 Add missing space to parameter SEI commit bb2d6b69e85797367fa29071c91c91a03a2daff2 [revision 1846] Author: Kieran Kunhya Date: Tue Dec 28 00:54:28 2010 +0000 Fix typo in documentation commit 7882a05dfe0c3de9f2e26dcf93f321ce65a3e82d [revision 1845] Author: Anton Mitrofanov Date: Sat Dec 18 08:29:18 2010 -0800 Fix redundant linebreaks in statsfile with weightp commit f5d4ca6a38ef6748dde02fe0603c1fa67fbd982f [revision 1844] Author: Ramiro Polla Date: Wed Dec 15 14:35:02 2010 -0200 Use cross_prefix for strings in endian test and as test commit 5f7967020eed7cb99dafe1366af9642d05cea3cd [revision 1843] Author: Anton Mitrofanov Date: Sun Jan 2 14:36:53 2011 -0500 Fix checkasm test for quant in high bit depth Eliminate some spurious failures. commit 0c4fa824ffacadf226cf68cedcf78602769d15d4 [revision 1842] Author: Steven Walters Date: Thu Dec 30 20:35:10 2010 -0500 Fix broken YV12 handling in the resize filter commit 712f6dff3ab96abeee4c7440302d81e245451d2c [revision 1841] Author: Fiona Glaser Date: Wed Jan 5 22:21:18 2011 -0500 Fix bug with negative lookahead mb costs in high bit depth commit 0aa25f66eaafecbc0b6eb86d24c04119e0454e76 [revision 1840] Author: Daniel Kang Date: Tue Jan 4 14:33:05 2011 -0500 Fix overflow in SSIM calculation in 10-bit commit 3c50b9b4cc52b91dcb71bfe2a542aee9fb9a9a97 [revision 1839] Author: Anton Mitrofanov Date: Fri Dec 24 14:52:57 2010 -0500 Fix some possible overflows in VFR ratecontrol with extreme timebases commit 5b91a48c7b88d27201800dc204e743bd2e76051a [revision 1838] Author: Steven Walters Date: Sun Jan 9 16:01:04 2011 -0500 Fix memory leak in lavf demuxer. Leak only occurred with input files that have more than one video stream. commit 50cae3cf1db1065a3570bd9ef29059c8ab49979e [revision 1837] Author: Fiona Glaser Date: Fri Dec 24 17:28:42 2010 -0500 Fix satd predictors with high bit depth Resulted in odd CRF-mode results with --no-mbtree, as well as suboptimal VBV handling. commit d50760c144dc0ee5023e166c7dd35dccd00a32b3 [revision 1836] Author: Daniel Kang Date: Fri Jan 7 23:05:50 2011 -0500 Fix compile error with high bit depth and disable-asm commit bab4eadd11ca59745dfce369c9fea427c73317a0 [revision 1835] Author: Fiona Glaser Date: Sat Dec 18 08:22:34 2010 -0800 Really fix gcc win32 misalignment crash gcc's -fno-zero-initialized-in-bss only works if an explicit initializer (e.g. = {0}) is used. commit 74ee50e539dc06bea6b4bbd2f674c21248c05970 [revision 1834] Author: Steven Walters Date: Sat Dec 11 20:30:29 2010 -0500 Support for native Windows threads Patch originally by Pegasys Inc. commit 25a1ffb266f409fe657d834b87c47e63cdaded3b [revision 1833] Author: Daniel Kang Date: Mon Dec 13 17:15:12 2010 -0500 MMX/SSE2 high bit depth weight_cache/offset(sub|add) functions Patch from Google Code-In. commit fd8cfd445016db99a99b7a4d3769e52599aeda0e [revision 1832] Author: Daniel Kang Date: Wed Dec 8 17:56:22 2010 -0500 SSE2 high bit depth dequant functions Patch from Google Code-In. commit 7271fc01d55944eb91ac7fdf2d4c96952bd609b2 [revision 1831] Author: Daniel Kang Date: Tue Dec 7 22:48:15 2010 -0500 SSE2 high bit depth zigzag functions Patch from Google Code-In. commit 6f4d6fe9b2abe7755d0e8f16375e790b82174c3b [revision 1830] Author: Daniel Alexandru Morie Date: Tue Dec 7 06:11:02 2010 -0800 MMX/SSE2 versions of high bit depth store_interleave Patch from Google Code-In. commit 898579cca8b0d2f7a63a4c3f0534226529e6e933 [revision 1829] Author: Vittorio Giovara Date: Fri Dec 10 20:43:00 2010 -0800 Add frame-packing SEI support for signalling 3D video commit dde9e9bc90fbd9e2553a04a5586085ba1500394e [revision 1828] Author: Fiona Glaser Date: Sat Dec 11 03:48:59 2010 -0800 Allow 8x8dct+cavlc+lossless with subme>=6 commit 7281537747fb52efec3272fd5a155bf7339d3e7a [revision 1827] Author: Yasuhiro Ikeda Date: Thu Dec 9 12:00:24 2010 +0900 Add interlaced/no-interlaced case to regression test script commit adb5d4bffb8bd9cf5cb170356bcdc931550da904 [revision 1826] Author: Yasuhiro Ikeda Date: Thu Dec 9 11:59:49 2010 +0900 Save more memory with weightp in >8-bit commit 031b37d34b6b74f6806e8602b57fbb4f325c33b2 [revision 1825] Author: Yasuhiro Ikeda Date: Thu Dec 9 11:57:38 2010 +0900 .gitignore more untracked file types commit 9e79fa38c4c0a98ae99b5d1d2e2e4480ed0c67d5 [revision 1824] Author: Anton Mitrofanov Date: Tue Dec 7 17:49:21 2010 +0300 Work around gcc/ld alignment bug on win32 Fixes problems due to misalignment of static zero arrays (win32 ld can't align .bss properly). commit 5043c17357f44bc875e4f90e586e236948826c72 [revision 1823] Author: Daniel Kang Date: Tue Dec 7 15:19:46 2010 -0500 Fix high bit depth intra pred functions And re-enable them accordingly. Patch from Google Code-In. commit 4fc1c711dc89541d251700b89d5e462a08a9f467 [revision 1822] Author: Fiona Glaser Date: Sat Dec 11 13:37:09 2010 -0800 Fix weightp analysis with high bit depth commit 2a3023965940f638aaeff35bab16651682af07c0 [revision 1821] Author: Oskar Arvidsson Date: Thu Dec 9 12:19:57 2010 +0100 Fix build error in high depth Caused by multiple definitions of x264_add8x8_idct_sse2. commit 972de279b363210fdf4858e3efe7203e979b6e36 [revision 1820] Author: Fiona Glaser Date: Tue Dec 7 03:15:46 2010 -0800 Hotfix for high bit depth Temporary fix for some unaligned access crashes. commit f68e1f8d5d9c9d9d7f0b2f91b587254d0bbec3da [revision 1819] Author: Anton Mitrofanov Date: Tue Dec 7 13:44:55 2010 +0300 Delete x264_config.h on distclean commit ef4a8d2e79049c8311a3ab78860557496688db93 [revision 1818] Author: Daniel Kang Date: Thu Nov 25 19:44:56 2010 -0500 Tons of high bit depth intra predict asm Patch from Google Code-In. commit c801fc6c2a5d0804c41ea77205fa049f10452dfe [revision 1817] Author: David Czech Date: Sat Nov 27 17:34:32 2010 -0800 SSE2 high bit depth 8x8/16x16 idct/idct_dc Patch from Google Code-In. commit abe11eaba564cfe564245dcac1f5e439a800ff1f [revision 1816] Author: Ramiro Polla Date: Tue Nov 30 02:17:23 2010 -0200 Create and install x264_config.h This header can be used to determine the bit-depth and license of libx264. commit ee6b482234b840e9dbce892b5b13f18f66d6fe54 [revision 1815] Author: Steven Walters Date: Wed Oct 13 21:53:50 2010 -0400 Detect Avisynth initialization failures Detect if there is a critical Avisynth initialization failure and print the associated error. This, however, requires a feature present in the latest version of Avisynth alpha (2.6). Previous versions are unaffected. commit 5b5b746834be4cb5c5d6ce275aed38e90d39cbd0 [revision 1814] Author: Fiona Glaser Date: Thu Nov 25 22:12:07 2010 -0800 Automatically restrict QPs to avoid quantization (under|over)flow --cqm jvt and similar should now work "out of the box" instead of requiring futzing with --qpmin. commit 90fa32a09d2f126c8be53cd2331c8bbd3f44fcce [revision 1813] Author: Anton Mitrofanov Date: Sat Dec 4 23:29:08 2010 +0300 Don't try to get timecodes if reading frame failed This fixes "input timecode file missing data for frame" warning with piped input where we don't know total number of frames. commit 8245feb264ce7b0ea75a654af9f90d74e45391e8 [revision 1812] Author: Oskar Arvidsson Date: Thu Nov 25 23:05:21 2010 +0100 Fix possible overflow in sub4x4_dct in 10-bit builds commit 4c9fe3fbedf99a8e4920233ee7df7d4e4fe27f0e [revision 1811] Author: Fiona Glaser Date: Mon Dec 6 14:19:09 2010 -0800 Fix bug in intra-refresh + threads Intra refresh bar quality increase wasn't correctly applied. commit 7f3fed96bf012921cf71e877423f4afaf6ffeb2b [revision 1810] Author: Fiona Glaser Date: Mon Dec 6 12:00:13 2010 -0800 Fix file handle leak in libx264 on error commit b15e52142bd25436dfb578fbee35eb89d337c015 [revision 1809] Author: Steven Walters Date: Sun Oct 10 18:17:35 2010 -0400 Fix incompatible csp format issue Problem occurred with unknown pixel formats and non mod2 resolutions in the resize filter. commit 88cc4b0d0e0e69107cfc26a7d4131341031ec4eb [revision 1808] Author: Anton Mitrofanov Date: Sat Nov 27 15:54:39 2010 -0800 Really fix fittobox resize rounding code commit 23060612f09601a0602ad9ff25fa9dd31c1b362a [revision 1807] Author: Yusuke Nakamura Date: Sun Dec 5 09:31:01 2010 +0900 Fix regression in rev1549 Skip auto timebase denominator generation when generated timebase denominator exceeds UINT32_MAX. Also fix double free. commit 6e41f7e2aa03aabbedccc40278f14ffbfb2cc0f8 [revision 1806] Author: Anton Mitrofanov Date: Sun Nov 28 01:05:02 2010 +0300 Fix --tcfile-in if timecode v2 file starts from nonzero pts commit 75110e63aba0876d35ffd37bf8edbb47639f9bc6 [revision 1805] Author: Sean McGovern Date: Fri Dec 3 22:30:51 2010 -0800 SPARC/Solaris build fixes commit ec1087a42bacd755d0a553fa13259d3af4add44b [revision 1804] Author: Fiona Glaser Date: Thu Nov 25 16:47:29 2010 -0800 Fix typo in r1797 commit cab2d8ad24e29c0d78d799a496608a44554c23f3 [revision 1803] Author: Tony Young Date: Wed Nov 24 16:58:38 2010 -0800 Add Python regression test script Patch from Google Code-In. commit 7e3019a3cef6710378c7d3090fa3d3348b59de6b [revision 1802] Author: Alex Wright Date: Wed Nov 24 02:19:51 2010 -0800 Make --weightp 1 a better speed tradeoff Since fade analysis is now so fast, weightp 1 now does fade analysis but no reference duplication. This is the opposite of what it used to do (reference duplication but no fade analysis). This also gives weightp's better fade quality to faster presets (up to superfast). commit aa5a32938309e649f0b0a258312c00719fb498c1 [revision 1801] Author: Daniel Kang Date: Tue Nov 23 20:29:37 2010 -0500 SSE versions of some high-bit-depth DCT functions Our first Google Code-In patch! commit 00524dfa8cef310b44a5e7dd3723c5072db5fa75 [revision 1800] Author: Anton Mitrofanov Date: Tue Nov 23 23:06:51 2010 +0300 Clean up weightp analysis function commit 580c5f2e7d9808850632a3748faa529e380bdc1b [revision 1799] Author: Fiona Glaser Date: Fri Nov 19 16:58:38 2010 -0800 Add API function to return max number of delayed frames commit dbed9592361abde6110c737d832367a0529815e7 [revision 1798] Author: Fiona Glaser Date: Thu Nov 25 13:01:33 2010 -0800 Copy field order flag in encoder_reconfig commit cbc37c5b89466abc620f5c41b8602d4db399936c [revision 1797] Author: Yasuhiro Ikeda Date: Wed Nov 24 23:09:54 2010 +0900 Cosmetics in configure commit 032880c9727bb1cefc4d1711c84152a5eca6fd07 [revision 1796] Author: Yasuhiro Ikeda Date: Mon Nov 22 11:01:57 2010 +0900 Add some more info to `x264 --version` commit ca8f00c7604cac37bd3103135521ebdc2d94340b [revision 1795] Author: Fiona Glaser Date: Sat Nov 20 23:30:42 2010 -0800 Change qpmin default to 0 There's probably no real reason to keep it at 10 anymore, and lowering it allows AQ to pick lower quantizers in really flat areas. Might help on gradients at high quality levels. The previous value of 10 was arbitrary anyways. commit 1cf769740c0fd143cb03b4290ba5238fce13eff6 [revision 1794] Author: Fiona Glaser Date: Thu Nov 25 13:01:16 2010 -0800 Fix ticks_per_frame check for VFR input commit b3dc88f65543b3f85661c71b7ffe96a6337b94f1 [revision 1793] Author: Steven Walters Date: Mon Nov 22 10:31:05 2010 +0900 Fix configure so that boolean configuration options are 1/0 There are many cases of 1/undef, not 1/0. commit d50e971f34ce2b18a2e11162126dc4de9a5d6c5e [revision 1792] Author: Sean McGovern Date: Sun Nov 21 01:59:33 2010 -0500 Only build SPARC VIS asm if high bit-depth is disabled commit 946f81551e3e8a1e6dc022a08788ba3004c9cf42 [revision 1791] Author: Sean McGovern Date: Sun Oct 10 19:34:18 2010 -0400 Fix build on SPARC Solaris 10 commit 13d9e7021bbaa07049c0e2c34cc1389c293daef0 [revision 1790] Author: James Darnley Date: Sun Nov 21 10:50:48 2010 +0100 Fix resize filter rounding code commit d9421c20385527c92236a82efeca0f6af4220d2f [revision 1789] Author: Anton Mitrofanov Date: Mon Nov 22 17:17:36 2010 +0300 Fix regression in chroma weightp Missing cache calls could cause artifacts, encoder/decoder desync. commit 729d9bcc46f3ab24f603ef7ab1603aee1669f32c [revision 1788] Author: Fiona Glaser Date: Fri Nov 19 15:40:23 2010 -0800 Fix some crashes with high bit depth Not all arrays were sufficiently aligned. commit f92aa4ecd9029c224afeaf9c59b3091602b6b641 [revision 1787] Author: Fiona Glaser Date: Sun Nov 14 03:34:26 2010 -0800 Chroma weighted prediction Like luma weighted prediction, dramatically improves compression in fades. Up to 4-8db chroma PSNR gain in extreme cases (short, perfect fade-outs). On actual videos, helps up to ~1% overall. One example video with a decent number of fades (ef OP): 0.8% bitrate reduction overall, 7% bitrate reduction just counting chroma. Fixes a lot of artifacts in fades at lower bitrates. Original patch by Dylan Yudaken . commit fa28f5b96dc61f13aec05ca75a24d0b34a5fc1b0 [revision 1786] Author: Fiona Glaser Date: Thu Nov 18 08:51:27 2010 -0800 Support custom cropping rectangles Supposedly useful for 3D television applications. commit 1382552b8c085af688e27f0417557ed69618051f [revision 1785] Author: Oskar Arvidsson Date: Sun Nov 14 16:46:01 2010 +0100 Convert X264_HIGH_BIT_DEPTH to HIGH_BIT_DEPTH Less verbose. commit abde94f64a2232f2ef6fb423d6138633442ef87a [revision 1784] Author: Oskar Arvidsson Date: Sat Oct 30 20:16:33 2010 +0200 x86 asm for high-bit-depth pixel metrics Overall speed change from these 6 asm patches: ~4.4x. But there's still tons more asm to do -- patches welcome! Breakdown from this patch: ~13x faster SAD than C. ~11.5x faster SATD than C (only MMX done). ~18.5x faster SA8D than C. ~19.2x faster hadamard_ac than C. ~8.3x faster SSD than C. ~12.4x faster VAR than C. ~3-4.2x faster intra SAD than C. ~7.9x faster intra SATD than C. commit 3afd514e222d4c4f0c984d258b1c17c0f12d6b89 [revision 1783] Author: Fiona Glaser Date: Sat Oct 30 19:13:05 2010 -0700 x86 asm for some high-bit-depth coefficient functions ~7.9x faster denoise than C. ~2.3x faster coeff_level_run than C. ~6.6x faster coeff_last than C. ~4.3x faster decimate_score than C. Also improve checkasm's decimate_score test. commit 612778d730df65acd4bc928aa3cd6770eb9c15e3 [revision 1782] Author: Oskar Arvidsson Date: Sun Nov 14 03:33:30 2010 +0100 x86 asm for high-bit-depth motion compensation ~8x faster qpel MC than C. ~10x faster hpel than C. commit 7946d913a6e3c9d83c2ace10a4f01c5b4052d618 [revision 1781] Author: Oskar Arvidsson Date: Thu Nov 4 02:13:43 2010 +0100 x86 asm for high-bit-depth quant ~3.1-4.2x faster than C. commit 03c61538ae77f5bd5f6c4b0c7a3fc6f41c48bcf1 [revision 1780] Author: Oskar Arvidsson Date: Sat Oct 30 16:55:48 2010 +0200 x86 asm for high-bit-depth DCT Only MMX and DCT done so far; iDCT still needs asm as well. ~4.4x faster than C. commit 515d560f84631bce4d12f04f47fe8074079de542 [revision 1779] Author: Oskar Arvidsson Date: Sat Oct 30 11:42:52 2010 +0200 x86 asm for high-bit-depth deblocking ~3.3x faster than C. commit 0016a8049a0e04a5719ddf24d0f03d1b332d7851 [revision 1778] Author: Oskar Arvidsson Date: Sat Nov 13 14:42:54 2010 +0100 Use a 16-bit buffer in hpel_filter regardless of bit depth This only works up to and including 10-bit (but we don't support anything higher yet). commit 0a6b2a688225a313fd934e5b01d48f7be3aa9f78 [revision 1777] Author: Henrik Gramner Date: Tue Nov 16 21:23:12 2010 +0100 Use enums instead of magic numbers in x264_mb_partition_pixel_table commit 866ac45c4bb7b2bdae99dab241b7b344dc07fbe7 [revision 1776] Author: Karl Blomster Date: Sun Nov 14 03:41:03 2010 -0800 Improve configure script logging Now prints the test program that failed in addition to error messages. commit 845e22876cf668abb3764d58dbb91af86c1895ac [revision 1775] Author: Anton Mitrofanov Date: Wed Nov 17 07:27:09 2010 -0800 Fix constrained intra pred mode selection commit 9c9f6340ebef4172eca5e5cf011826cadddb5012 [revision 1774] Author: Anton Mitrofanov Date: Wed Nov 17 02:46:30 2010 +0300 Various high-bit-depth ratecontrol fixes commit b2f40814b2ef3e16a83e017f02af5d91187a4797 [revision 1773] Author: Loren Merritt Date: Sun Nov 14 02:54:02 2010 -0800 Fix a crash in --dump-yuv for odd resolutions commit 8c08475df59f816145b2a8bef35039c5e11bd438 [revision 1772] Author: Anton Mitrofanov Date: Thu Nov 11 01:40:52 2010 +0300 Improve flash detection algorithm change in r1765 Now only disables scenecuts only near real end of video, not just prior to forced keyframes. commit 9df5214b5e5b5c3311d9e612fdebb2c36525648f [revision 1771] Author: Steven Walters Date: Wed Nov 10 07:21:41 2010 -0500 Update ffms2 support for its latest API break. commit afd79f1de5d63570243d6b6462b03cae8fc6c683 [revision 1770] Author: Fiona Glaser Date: Thu Nov 11 18:19:22 2010 -0800 Modify the x264 header accordingly if --disable-gpl is used commit 84bfe64ddf7bd22a579e76ae31562eed47a381e2 [revision 1769] Author: Fiona Glaser Date: Thu Nov 11 22:25:31 2010 -0800 Save a bit of memory with weightp + high bit depth commit 5d7b2ab6c51ebb32ca2dd999f9694f81baa750c2 [revision 1768] Author: Fiona Glaser Date: Sat Nov 13 04:38:44 2010 -0800 Fix bugs in qpfile parsing with omitted QPs commit 1e902646b2e1f470dadd268a4f45a699b10434ec [revision 1767] Author: Kieran Kunhya Date: Fri Nov 12 21:53:28 2010 +0000 Fix HRD with intra-refresh x264 was incorrectly calculating cpb_removal_delay with respect to the first keyframe. It should have been calculating cpb_removal_delay with respect to the last keyframe. commit 180d081f2f1ca4d92235fbc776211e75607ea0db [revision 1766] Author: Anton Mitrofanov Date: Wed Nov 10 07:34:40 2010 -0800 Fix bug in r1753 Overflow compensation fix broke CRF with --no-mbtree. commit e6c22fef723b44ecab5f597cf24f642c1d54741d [revision 1765] Author: Fiona Glaser Date: Sat Nov 6 17:47:27 2010 -0700 Improve flash detection's behavior near the end of the video Flash detection catches situations like AAAABBCCDDDD, where A,B,C,D are frames in different scenes. x264 would place a keyframe on the first "D". However, if the video ended on the last "C", x264 would place a keyframe on the first "C", even though C classifies as a flash. This change fixes this issue. commit 2f2ab0fa6c873c32363d7c3115f483fafdbe326f [revision 1764] Author: Fiona Glaser Date: Sun Oct 31 15:51:48 2010 -0700 Improve quantizer handling The default value for i_qpplus1 in x264_picture_t is now X264_QP_AUTO. This is currently 0, but may change in the future. qpfiles no longer use -1 to indicate "auto"; QP is just omitted. The old method should still work though. CRF values now make sense in high bit depth mode. --qp should be used for lossless mode, not --crf. --crf 0 will still work as expected in 8-bit mode, but won't be lossless with higher bit depths. Add bit depth to statsfiles. These changes are required to make the QP interface sensible in combination with high bit depth. commit d50a5bfd0b457c211a0ed2868b3e13be28dfa764 [revision 1763] Author: Fiona Glaser Date: Wed Nov 3 23:17:08 2010 -0700 VFR-aware PSNR/SSIM measurement First step to VFR-aware MB-tree and bit allocation. commit 506683ae4789946e45d328250e72b304810fdb0f [revision 1762] Author: Fiona Glaser Date: Mon Nov 1 15:08:03 2010 -0700 Disable weightp offset=-1 dupes with high bit depth They're a hack to compensate for crappy rounding, and thus not worth doing at high bit depth, which fixes most of the rounding issues. commit 6cff5834068758ecfcd38425f817320495bdd251 [revision 1761] Author: Fiona Glaser Date: Sun Nov 7 17:27:38 2010 -0800 Make the ffmpeg -vpre error message more descriptive commit 3d96daca538d849e0b9b88c45f8c3820aed9628e [revision 1760] Author: Loren Merritt Date: Sat Oct 30 14:39:50 2010 -0700 Add numeric names for the presets (0==ultrafast ... 9==placebo) This mapping will of course change if new presets are added in between, but will always be ordered from fastest to slowest. commit b9461a15b33936a6fd5583da843c132d4fe030f6 [revision 1759] Author: Fiona Glaser Date: Wed Oct 13 06:07:14 2010 -0700 Update benchmarks in doc/threads.txt commit af28501230c2c511aa8b33660ca8d35f50b613ea [revision 1758] Author: Loren Merritt Date: Thu Oct 28 13:29:42 2010 -0700 Make the #if'd out naive ESA actually match the real implementation commit 259790037c2f72532cf6d6fa9e632ad08ddf9574 [revision 1757] Author: Fiona Glaser Date: Mon Nov 1 19:19:23 2010 -0700 Move mv/ref prefetch code to the correct location Prefetching of top blocks should be done under if(top), not if(left). commit 633f938dcf0e5c69947483da18acae7e88fdd99a [revision 1756] Author: Reinhard Tartler Date: Tue Nov 9 23:57:12 2010 -0800 Link x264cli explicitly against lavf Fixes some problems with crappy linkers. commit 15f4006c6d478cdfe8e456de6aa1ecf35af40be0 [revision 1755] Author: Fiona Glaser Date: Mon Nov 8 22:14:58 2010 -0800 Fix CBR ratecontrol bug with extremely high qscales Caused CBR ratecontrol to take a very long time to recover from extreme situations (e.g. /dev/urandom). commit 0d6c3f3c3bbe89b6d1f215c86c102502ea8c201d [revision 1754] Author: Fiona Glaser Date: Mon Nov 8 21:03:01 2010 -0800 Disable overflow compensation in CRF mode Wasn't designed with CRF in mind, and acts really weird with CRF+VBV. commit 95f1474fcdb0714f24185a25ec16a7da6671f2a0 [revision 1753] Author: Fiona Glaser Date: Mon Nov 8 19:56:29 2010 -0800 Fix stupid bug in B-frame VBV size prediction commit 95268ca03408b2768ce2fcd768ef61d25ea6a1f6 [revision 1752] Author: Oskar Arvidsson Date: Fri Oct 29 13:13:25 2010 +0200 Fix regression in checkasm in r1666 Buffer is uint16_t* regardless of whether x264 was compiled with high bit depth or not. commit 8efd67c034190b415174fd03c3cfef4768345f11 [revision 1751] Author: Oskar Arvidsson Date: Fri Oct 29 13:11:09 2010 +0200 Fix overflows in satd, sa8d and hadamard_ac with high bit depth commit 803864ff6cf5b6869a94ee9915d886e8c372e72a [revision 1750] Author: Oskar Arvidsson Date: Fri Oct 29 12:34:42 2010 +0200 Fix potential problem with overflows in ssd_nv12 The risk of overflows increases exponentially with the bit depth. The 8-bit asm versions may still overflow with image widths >= 11008 (or 6604 if interlaced). commit 3db6b2c22cd1b8ee00c10ea6d705d6fbec8544d0 [revision 1749] Author: Fiona Glaser Date: Sat Oct 30 14:36:01 2010 -0700 Fix syntax for some parameterless functions Technically, such functions should be declared with (void), not (). commit 24a56d38867e848765c78605846a5d6097f5392c [revision 1748] Author: Steven Walters Date: Sat Oct 30 16:51:01 2010 -0400 Fix fps reporting on mingw64 _ftime on mingw64 uses __timeb32 which is broken. Use ftime instead. commit 30085672e58c29db2fa107b7dab58e10647b6722 [revision 1747] Author: Manuel Rommel Date: Sun Oct 31 19:19:10 2010 +0100 Fix compilation on PPC with some recent GCCs commit 3ffbfed7e8a45aeafde8eba55f944f280fe015aa [revision 1746] Author: Manuel Rommel Date: Mon Oct 11 13:50:09 2010 -0700 Fix Altivec SATD with small strides Fixes chroma ME and some of lookahead on PPC. commit 33edb51fbc5c2d040e1f0f6534d78ddbb8d11cae [revision 1745] Author: Holger Lubitz Date: Sun Oct 3 19:07:00 2010 +0200 Address remaining cacheline split issues in avg2 Slightly improved performance on core 2. Also fix profiling misattribution of w8/16/20 mmxext cacheline loops. commit 87829c982a751a0d031340d1ed7fbada23039d40 [revision 1744] Author: Fiona Glaser Date: Wed Sep 29 18:56:27 2010 -0600 Trim a few bytes off some x86 intra pred functions commit e4b44c2e267a8a0771777422c626aba51c8e5194 [revision 1743] Author: Yusuke Nakamura Date: Fri Oct 1 00:37:39 2010 +0900 Move DTS compression from libx264 to x264cli DTS compression is an ugly stupid hack and starting to encroach on unrelated areas like VBV. Some people want it in the mp4 muxer for devices and/or splitters that don't support Edit Boxes. We just say "throw these broken devices out the window". DTS compression will remain as a muxer option, --dts-compress, at the user's own risk. This option is disabled by default. commit 2eb4139f72d025fa4c77c4391d6f1b67ec2b6f8e [revision 1742] Author: Fiona Glaser Date: Thu Sep 30 22:24:51 2010 -0700 Use a larger pic_init_qp with high bit depth Modify pic_init_qs for consistency. commit 84bb443d53fb03d9899e266688d0af5587562c6c [revision 1741] Author: Fiona Glaser Date: Sat Oct 2 23:56:52 2010 -0700 Update some of the information in doc/ commit 49e5105aad41370fbf5611b23f5daf8caf128789 [revision 1740] Author: Fiona Glaser Date: Tue Sep 28 17:48:00 2010 -0700 Update header in depth.c commit ef905ad646dbabcc9c4d163862b9ab728db07a54 [revision 1739] Author: Fiona Glaser Date: Sat Oct 2 23:12:41 2010 -0700 Remove some old unused stuff in the build tree Regression test (hasn't been updated since svn). Doxy (was never used). commit 34b590b127cfd1eee13db826a9a9e2ac9faf6a20 [revision 1738] Author: Anton Mitrofanov Date: Wed Sep 29 00:19:06 2010 -0700 Various cosmetics Exorcise some CamelCase. commit 2bd8c8f5f6041a8169ad25a1aea4d48223893cc2 [revision 1737] Author: Fiona Glaser Date: Sun Oct 10 04:39:36 2010 -0700 Add missing mod4 stack check to sse2_misalign mc_chroma Required for ICC compilation. commit 0dbc490af39525adbefc1757151e5801c79eac3b [revision 1736] Author: Anton Mitrofanov Date: Fri Oct 8 18:08:23 2010 +0400 Fix 2pass ratecontrol with --nal-hrd cbr commit 8ee7b59a4afb230fba336bc08a9047f028708bfb [revision 1735] Author: Fiona Glaser Date: Mon Oct 4 13:33:23 2010 -0700 Fix minor bug in intra pred with intra refresh i8x8 blocks didn't properly avoid predicting from top-right when necessary. This could cause intra refresh to not completely refresh the frame. commit a86866106530d852722d6c5ddc6a9d4274351715 [revision 1734] Author: Anton Mitrofanov Date: Wed Sep 29 22:06:27 2010 +0400 Fix filter parsing with --extra-cflags="-DNDEBUG" commit 91b83f585a5cdc45a3fb83100b09a2fb9dacc02e [revision 1733] Author: Fiona Glaser Date: Wed Sep 29 00:15:14 2010 -0700 Make sigint handler variable volatile Didn't actually cause any problems, but is necessary because it can be modified by another thread (the signal call). commit 47e2609852b2de996071633c94de8d273b66ad05 [revision 1732] Author: Fiona Glaser Date: Sun Sep 26 21:04:30 2010 -0700 Add High 10 Intra profile support (AVC-Intra) x264 should now be able to encode compliant AVC-Intra 50. With a 10-bit-compiled version of x264, a sample commandline for 1080i25 might be: --interlaced --keyint 1 --vbv-bufsize 2000 --bitrate 50000 --vbv-maxrate 50000 --nal-hrd cbr Also print "Constrained Baseline" for baseline profile, since that's all x264 (and everything else in the world) supports. Also reorganize parameter validation a bit to reduce some spurious warnings. commit 0467589e35295c522bdae382e0e3b021deea9919 [revision 1731] Author: Oskar Arvidsson Date: Mon Sep 27 16:02:20 2010 +0200 Finish support for high-depth video throughout x264 Add support for high depth input in libx264. Add support for 16-bit colorspaces in the filtering system. Add support for input bit depths in the interval [9,16] with the raw demuxer. Add a depth filter to dither input to x264. commit b6b8aea6baaac8284a61f5879ba94a26a3cd6156 [revision 1730] Author: Alex Wright Date: Sun Sep 19 05:08:22 2010 -0700 Chroma mode decision/subpel for B-frames Improves compression ~0.4-1%. Helps more on videos with lots of chroma detail. Enabled at subme 9 (preset slower) and higher. commit 361721986f678065069d40c70bf57747afc0284c [revision 1729] Author: Fiona Glaser Date: Mon Sep 27 05:39:02 2010 -0700 Various cosmetics commit eacca4fa4c39c8140c3718ffbd82be4fb2baeba7 [revision 1728] Author: Fiona Glaser Date: Tue Sep 21 17:11:00 2010 -0700 Make slice-max-size more aggressive in considering escape bytes The x264 assumption of randomly distributed escape bytes fails in the case of CABAC + an enormous number of identical macroblocks. This patch attempts to compensate for this. It is probably safe to assume in calling applications that x264 practically never violates the slice size limitation. commit e02a6d46b8e3621a0c285ced56df727368ff1c7f [revision 1727] Author: Fiona Glaser Date: Mon Sep 27 05:39:13 2010 -0700 Add missing emms for dump-yuv commit 99c9b6de276dd499b6d56e50c679e4033eb915ad [revision 1726] Author: Anton Mitrofanov Date: Sat Sep 25 15:55:32 2010 -0700 Fix CFR ratecontrol with timebase != 1/fps Fixes VBV + DTS compression, among other things. commit f655f8ad1554cef6fc0040d7b7fa2fcf22ba3b15 [revision 1725] Author: Anton Mitrofanov Date: Mon Sep 20 13:10:13 2010 +0400 Fix DTS/bitrate calculation if the first PTS wasn't zero Fix bitrate calculation with DTS compression. commit 8c4218c159931b2cb04958d0510368168698421f [revision 1724] Author: Anton Mitrofanov Date: Sun Sep 19 19:11:06 2010 +0400 Fix regression in r1716 commit d2a886339597426c12a7ee9c6462bf89f85d91a6 [revision 1723] Author: Fiona Glaser Date: Sun Sep 19 00:25:27 2010 -0700 Cosmetics in me.c and frame.c commit 947e71c3dbee76383546a768d9bf84a3883efbd6 [revision 1722] Author: Kieran Kunhya Date: Mon Sep 13 15:09:06 2010 +0100 Add support for arbitrary user SEIs This allows calling applications to insert SEIs that x264 doesn't know about while maintaining HRD/VBV accuracy. commit e3af0b67fd87066bf55001754d82e759479fe9d6 [revision 1721] Author: Steven Walters Date: Wed Sep 15 20:42:08 2010 -0400 Add full chroma input flag to swscale Improves quality of colorspace conversions involving RGB(A). commit 3145e67de9ec38ab5432023286f285e467355c05 [revision 1720] Author: James Darnley Date: Fri Sep 17 04:06:59 2010 -0700 Add --disable-gpl option to configure Used for commercially-licensed versions of x264. Doesn't currently change anything, but may be used to disable GPL-only CLI tools, such as video filters, in the future. Also print the x264 license and libavformat license in version info. commit 213a99d070ebd4f9aeffe7cb3ed9bd7fe755ec7f [revision 1719] Author: Fiona Glaser Date: Fri Sep 17 04:03:27 2010 -0700 Update source file headers Update dates, improve file descriptions, make things more consistent. Also add information about commercial licensing. commit a35a495d8889b1265a558c06920b7e83c9cd1117 [revision 1718] Author: Fiona Glaser Date: Wed Sep 15 12:06:47 2010 -0700 Fix intra refresh to not exceed max recovery_frame_cnt The spec constrains recovery_frame_cnt to [0, MaxFrameNum-1]. So make MaxFrameNum bigger in the case of intra refresh. commit f1c48203a9985d05ed97c32c9ff9c9d76cd8d9c8 [revision 1717] Author: Fiona Glaser Date: Thu Sep 16 03:36:17 2010 -0700 Make intra refresh finish one frame faster In some cases, the last frame of intra refresh was redundant. Saves a few bits. commit 22f9984be388389e7f356e76074255797a4fed74 [revision 1716] Author: Fiona Glaser Date: Tue Sep 14 12:20:00 2010 -0700 Fix intra refresh to not predict from invalid pixels The blocks on the right side of the intra refresh column should not predict from top-right. commit 90c51a765a42e5db6ccf558695fc6abf54f7e1cd [revision 1715] Author: Steven Walters Date: Mon Sep 13 18:47:33 2010 -0400 Add configure check for mingw64 prefixing This compensates for the inconsistent prefixing seen in different versions of the compiler. commit 683abb46df51083cc4d9da2b73bad39421c510eb [revision 1714] Author: Manuel Rommel Date: Sat Sep 4 19:31:53 2010 -0700 Update some Altivec function prototypes Silences a lot of warnings. commit ceba5dd5aa576d5d9f4d7a4213303a07bce91c15 [revision 1713] Author: Takashi Hirata Date: Mon Aug 30 18:13:49 2010 +0900 Add support for level 1b This level is a stupid hack in the H.264 spec, so it's a stupid hack in x264 too. Since level is an integer, calling applications need to set level_idc=9 to use it. String-based option handling will accept "1b" just fine though, so CLI users don't have to worry. commit 818532b1b048ac98129f5aaee7f2322f407ae482 [revision 1712] Author: Fiona Glaser Date: Thu Sep 2 15:29:29 2010 -0700 Use smaller values for idr_pic_id Saves a few bits and fixes problems on certain fantastically terrible decoders, such as the Apple iPad. commit d48f5f8c83a6ea5fbdd309b2ab6284bdf96550a9 [revision 1711] Author: Fiona Glaser Date: Mon Aug 30 12:32:31 2010 -0700 Use POC type 2 for streams with no B-frames Saves a few bits per slice header. commit 1a579035c7e093e236debc9649bff7362ef9782f [revision 1710] Author: Fiona Glaser Date: Sun Aug 29 22:18:07 2010 -0700 Faster cabac_encode_ue_bypass Use CLZ + a lut instead of a loop. commit 270c72d43aa58293fcbac257b00701f5ae6b103a [revision 1709] Author: Henrik Gramner Date: Wed Sep 1 00:53:42 2010 +0200 Faster nal_escape asm commit 24a964f7f59bf6e501d0612c8c82b6d8b13fd033 [revision 1708] Author: Anton Mitrofanov Date: Tue Aug 31 08:45:22 2010 -0700 Allow --demuxer forcing with known extensions commit 51df06a4bb61cd72d7f74c08768b2dd706da8322 [revision 1707] Author: Anton Mitrofanov Date: Fri Sep 3 13:33:44 2010 -0700 Minor fixes/cosmeticcs in commandling parsing commit 3f46301fa2c26472beac370e85a03c1117abc2a4 [revision 1706] Author: Anton Mitrofanov Date: Fri Sep 3 08:39:48 2010 -0700 Fix overflow in stats printing commit de0dd4aa8b593e390982d28e0906d067c1c7ede2 [revision 1705] Author: Anton Mitrofanov Date: Sun Aug 29 16:35:32 2010 +0400 Fix bug in 2pass if the first P-frames are all skip last_qscale_for was read before being initialized in this case, resulting in the value from the previous iteration being used instead. commit 9fd187e63a2296a3334abc3f9ef1ade458ff671c [revision 1704] Author: Fiona Glaser Date: Thu Aug 26 09:12:01 2010 -0400 Don't do deblock-aware RD if deblocking is off commit 268618932b0c065c1ab1eea26311f35937073c58 [revision 1703] Author: Fiona Glaser Date: Sat Aug 21 00:15:53 2010 -0700 CAVLC "trellis" ~3-10% improved compression with CAVLC. --trellis is now a valid option with CAVLC. Perhaps more importantly, this means psy-trellis now works with CAVLC. This isn't a real trellis; it's actually just a simplified QNS. But it takes enough shortcuts that it's still roughly as fast as a trellis; just not quite optimal. Thus the name is a bit of a misnomer, but we're reusing the option name because it does the same thing. A real trellis would be better, but CAVLC is much harder to trellis than CABAC. I'm not aware of any published polynomial-time solutions that are significantly close to optimal. commit 34649ace2ce4cb4bde4bcbdc78919dfca358d6a3 [revision 1702] Author: Fiona Glaser Date: Sat Aug 21 16:51:39 2010 -0500 Add global #define for maximum reference count This should make it easier to play around with reference frame counts that exceed the spec maximum. commit 2846aaa76f3e20b73225eaaa3f710ad701652152 [revision 1701] Author: Fiona Glaser Date: Mon Aug 16 17:47:11 2010 -0700 Simplify addressing logic for interlaced-related arrays In progressive mode, just make [0] and [1] point to the same place. commit da6c3ecc955a5a7757efed84c542defd5b0fcc9b [revision 1700] Author: Fiona Glaser Date: Mon Aug 23 18:59:35 2010 -0400 Add missing emms to x264_nal_encode Only matters for applications using the low-latency callback feature. commit ee62228587849cfbee75a37300515bb8d84a4f71 [revision 1699] Author: Fiona Glaser Date: Tue Aug 17 14:38:41 2010 -0700 Fix 2 bugs with slice-max-size Macroblock re-encoding didn't restore mv/tex bit counters (slightly inaccurate 2-pass). Bitstream buffer check didn't work correctly (insanely large frames could break encoding). commit 8782049fa11d6d8451b88e553ab33707bcc9b6b8 [revision 1698] Author: Manuel Rommel Date: Thu Aug 12 12:54:00 2010 -0700 NV12 version of Altivec chroma MC commit c9f17d9378245ad37bbad07aa5b4915b169292ff [revision 1697] Author: Fiona Glaser Date: Tue Aug 10 16:55:05 2010 -0700 Deblock-aware RD Small quality gain (~0.5%) at lower bitrates, potentially larger with QPRD. May help more with psy, maybe not. Enabled at subme >= 9. Small speed cost (a few %). commit 8b2e4a080a469d8b22a275b8afef3b19b566a4e9 [revision 1696] Author: Brad Smith Date: Sun Aug 8 18:13:32 2010 -0400 Correct X header path usage in configure Don't unconditionally set the header path for OpenBSD but do so if the --enable-visualize flag is specified. commit 2cbe0df7aa31349dfe2903c409d089bda40c978d [revision 1695] Author: golgol7777 Date: Sat Aug 7 23:01:46 2010 -0700 Fix lavf input with delayed frames commit 18de9f673aaaf28bdc160f1e98c221a3300d011c [revision 1694] Author: Alexander Strange Date: Sat Aug 7 22:29:12 2010 -0700 Slightly improve the filtering section of x264 --help commit cb1ab4495ecf8b614e23f13ef4429474c0e3ab7c [revision 1693] Author: Fiona Glaser Date: Sat Aug 7 22:32:06 2010 -0700 Fix debug message typo with DTS compression commit 57543fa6758532287d11583e42e475972790ee5c [revision 1692] Author: Yasuhiro Ikeda Date: Tue Aug 3 22:10:15 2010 +0900 Try to guess input length for lavf input Allows printing of progress indicator when using lavf input. commit a82a0ce26f47460b5cb1487ab758ff04c83b859a [revision 1691] Author: Yasuhiro Ikeda Date: Tue Aug 3 22:07:36 2010 +0900 Workaround bug in fps/timestamp handling with lavf input reordered_opaque in lavf doesn't work correctly in the identity case (no reordering). Fixes incorrect output for some file types (e.g. raw in mov). commit 9c5a15e467a4a9d2ca804ab03d3925ff21f10390 [revision 1690] Author: Mike Matsnev Date: Sun Aug 1 12:08:20 2010 -0700 Fix aspect ratio writing in the MKV muxer The braindead Matroska spec dictates aspect ratio to be measured in pixels instead of, well, an actual aspect ratio. commit 4ad7cc5a431d3b3e6cb125d02d7fed4932e19542 [revision 1689] Author: Anton Mitrofanov Date: Thu Jul 29 20:23:55 2010 +0400 Add libavcore check in configure commit 3882fdba49bec819eb9a8dc851224683e79b3fab [revision 1688] Author: Fiona Glaser Date: Mon Jul 26 15:38:13 2010 -0700 Improve quantizer distribution with sliced-threads+VBV Should help avoid cases of very uneven quantizer choice between slices. commit 445082856fec2be71d33d6415801317ceecc0cbd [revision 1687] Author: Fiona Glaser Date: Wed Jul 28 11:42:06 2010 -0700 Remove dead code in slicetype.c commit abaa820979d4f0c2fc5944d30135e814a230d2d8 [revision 1686] Author: golgol7777 Date: Wed Jul 28 00:54:38 2010 +0900 Fix incorrect duration/framerate/bitrate in flv header commit acd70bf7b9c922a00df7c96f248a17d419f8ed3d [revision 1685] Author: Fiona Glaser Date: Wed Jul 28 14:23:53 2010 -0700 invalidate_reference fixes invalidate_reference didn't actually invalidate the immediate previous frame, only frames that came before that. Make sure that reordering is forced when invalidate_reference is used, so that the reference list is correct decoder-side. commit b476e0583896124e6ec33ccf7756b240deae0d96 [revision 1684] Author: Steven Walters Date: Sun Jul 25 19:45:27 2010 -0400 Filtering system-related fixes Fix configure to check for outdated libavutil in resize filter support. Do not print an explicit error message in ffms when requesting a frame beyond the number of frames in the source. Mention in --*help that filtering options can be specified as name=value. Fix the shadowing warning in the resize filter on posix systems. commit 5cbf8cf1ee08772a75278c5d9b5cd8d39874e3bb [revision 1683] Author: Fiona Glaser Date: Wed Jul 21 17:40:14 2010 -0700 Improve reference_invalid support Reference invalidation can now be used to invalidate multiple frames at a time, rather than being limited to one per encoder_encode call. commit 0d70de178aba32900c671c4af62808147c17570e [revision 1682] Author: Loren Merritt Date: Thu Jul 22 06:40:12 2010 +0000 Eradicate all mention of SI/SP-frames commit f57ef856c2d88b1201571a206eb78ff10a93c57f [revision 1681] Author: Fiona Glaser Date: Wed Jul 21 11:25:11 2010 -0700 Fix stack alignment with MB-tree Broke 2-pass with MB-tree when calling from compilers with broken stack alignment (e.g. MSVC). commit 7ff60b1bbb0fc6d3aaf7c3f2bffc8833ef763cd0 [revision 1680] Author: Steven Walters Date: Sat Jul 17 17:43:37 2010 -0400 Avisynth 2.6 colorspace support Use a customized avisynth_c.h to detect the new planar colorspaces. commit b9dd9cfb16ae3e8cd59fd9e71abfeff5b4a5ae2f [revision 1679] Author: Loren Merritt Date: Thu Jul 15 23:49:03 2010 -0700 Prevent some cases of cache aliasing. Avoid cases where image strides were a large power of 2. Core 2: +3% speed at widths 898..960, +6% at widths 1922..1984, most other resolutions unaffected. Nehalem and AMD: similar amount of speedup, but fewer resolutions affected. commit 22655d902e35b0eb8529803c8b4b80dce4a3428e [revision 1678] Author: Fiona Glaser Date: Thu Jul 15 19:35:52 2010 -0700 Fix stack alignment for adaptive quant Broke calls from compilers with broken stack alignment (e.g. MSVC). commit 96383a3bc717fee2476e6b1e3ff4a708152a37e2 [revision 1677] Author: David Conrad Date: Thu Jul 15 18:58:28 2010 -0400 Fix compilation with shared ffmpeg libs lavf input uses libavutil functions, so it must request flags for libavutil from pkg-config. commit e3687899bc45a49083acaaec116e7c70bbcfca37 [revision 1676] Author: Fiona Glaser Date: Thu Jul 15 13:20:50 2010 -0700 Fix another PCM bug CABAC assumes that NNZ is 0 or 1, not the number of actual nonzero coefficients. Didn't actually break the output; only had a tiny effect on RD. commit f98fed632827610c80950fc201b7a5306968ffe9 [revision 1675] Author: Oskar Arvidsson Date: Thu Jul 15 14:01:36 2010 +0200 Fix regression in r1666 Broke encoding of PCM macroblocks. commit a026b7c6ae4e4cba03323bfa7b7af573370101fb [revision 1674] Author: Oskar Arvidsson Date: Thu Jul 15 08:04:47 2010 +0200 Fix build with bit_depth > 8 Definition of x264_cli_plane_copy was inconsistent with declaration. commit 387828eda87988ad821ce30f818837bd4280bded [revision 1673] Author: Loren Merritt Date: Thu Jul 8 12:24:16 2010 -0700 Convert x264 to use NV12 pixel format internally ~1% faster overall on Conroe, mostly due to improved cache locality. Also allows improved SIMD on some chroma functions (e.g. deblock). This change also extends the API to allow direct NV12 input, which should be a bit faster than YV12. This isn't currently used in the x264cli, as swscale does not have fast NV12 conversion routines, but it might be useful for other applications. Note this patch disables the chroma SIMD code for PPC and ARM until new versions are written. commit c58954cc7c2516dd5f704a506da9fe824f34d9df [revision 1672] Author: Steven Walters Date: Mon Jul 5 17:37:47 2010 -0400 Add video filtering system to x264cli Similar to mplayer's -vf system. Supports some basic operations like resizing and cropping. Will support more in the future. See the help for more details. commit da978ebe60f4d3e08cff46704762d2471d280508 [revision 1671] Author: Fiona Glaser Date: Tue Jul 6 13:39:44 2010 -0700 Eliminate edge cases for MV predictors Saves a few clocks in mv pred. commit b4217e40d4ab41499f83cbcfa9542a9b0500835d [revision 1670] Author: Fiona Glaser Date: Thu Jul 8 12:45:25 2010 -0700 Improve scenecut detection a bit Put a minimum value on the scenecut threshold; makes x264 more likely to catch successive scenecuts (but might increase the odds of false detection). This also fixes scenecut detection with keyint=infinite. Also print keyint=infinite in the x264 SEI and statsfile correctly. commit de8a6e9d6b2bb2a0961abcaad4edc43d74702df9 [revision 1669] Author: Fiona Glaser Date: Wed Jul 14 18:47:14 2010 -0700 Fix 8x8dct+slices+no sliced threads+cavlc+deblock Deblocking was done slightly incorrectly. Regression in r1612. commit b7f6428a0b188d9118626dd4dc64415266ce8b6f [revision 1668] Author: Fiona Glaser Date: Thu Jul 8 16:20:48 2010 -0700 Fix off-by-one error in slice VBV predictor updates commit d6d614dee05b55e75b0f2938cc170a9eac312db2 [revision 1667] Author: Anton Mitrofanov Date: Mon Jul 5 17:44:15 2010 +0400 Fix disabling of progress with --log-level commit c91f43a4b09dab84953f417e6d6662ec0fa7acb1 [revision 1666] Author: Oskar Arvidsson Date: Fri Jul 2 04:06:08 2010 +0200 Support for 9 and 10-bit encoding Output bit depth is specified on compilation time via --bit-depth. There is currently almost no assembly code available for high-bit-depth modes, so encoding will be very slow. Input is still 8-bit only; this will change in the future. Note that very few H.264 decoders support >8 bit depth currently. Also note that the quantizer scale differs for higher bit depth. For example, for 10-bit, the quantizer (and crf) ranges from 0 to 63 instead of 0 to 51. commit b7789b1f08e27103576d9b9f0feea9b75e2eca56 [revision 1665] Author: Fiona Glaser Date: Wed Jun 30 13:55:46 2010 -0700 Support infinite keyint (--keyint infinite). This just means x264 won't insert non-scenecut keyframes. Useful for streaming when using interactive error recovery or some other mechanism that makes keyframes unnecessary. Also change POC logic to limit POC/framenum LSB size (to save bits per slice). Also fix a bug in the CPB underflow detection code (didn't affect the bitstream, just resulted in the failure to print certain warning messages). commit e480c9c8b143422e3d51cb0abbb9e6578888852a [revision 1664] Author: Fiona Glaser Date: Wed Jun 30 13:06:22 2010 -0700 Don't check i16x16 planar mode unless previous modes were useful Saves ~160 clocks per MB at subme=1, ~270 per MB at subme>1 (measured on Core i7). Negligle effect on compression. Also make a few more arrays static. commit 43a4334670ae60d0f8a30d3e4bd530d3b90a1ce1 [revision 1663] Author: Steven Walters Date: Sat Jun 26 16:28:49 2010 -0400 Centralize logging within x264cli x264cli messages will now respect the log level they pertain to. Slightly reduces binary size. commit 899bf0fdb91d85acf0fde88b9aa1cb01755e8c71 [revision 1662] Author: Lamont Alston Date: Tue Jun 29 10:11:42 2010 -0700 Make open-GOP Blu-ray compatible Blu-ray is even more braindamaged than we thought. Accordingly, open-gop options are now "normal" and "bluray", as opposed to display and coded. Normal should be used in all cases besides Blu-ray authoring. commit 4cd44841f5ea8816f81a7975480cea6da10ad1f5 [revision 1661] Author: Fiona Glaser Date: Mon Jun 28 15:02:33 2010 -0700 Callback feature for low-latency per-slice output Add a callback to allow the calling application to send slices immediately after being encoded. Also add some extra information to the x264_nal_t structure to help inform such a calling application how the NAL units should be ordered. Full documentation is in x264.h. commit a0ce4b768d46137692531e2869800d7d3c419e42 [revision 1660] Author: Loren Merritt Date: Sat Jun 26 20:55:59 2010 -0700 Simplify pixel_ads commit edc1135e59416b4311f54375b6659e7340c81193 [revision 1659] Author: Fiona Glaser Date: Wed Jun 23 17:29:34 2010 -0700 Interactive encoder control: error resilience In low-latency streaming with few clients, it is often feasible to modify encoder behavior in some fashion based on feedback from clients. One possible application of this is error resilience: if a packet is lost, mark the associated frame (and any referenced from it) as lost. This allows quick recovery from errors with minimal expense bit-wise. The new i_dpb_size parameter allows a calling application to tell x264 to use a larger DPB size than required by the number of reference frames. This lets x264 and the client keep a large buffer of old references to fall back to in case of lost frames. If no recovery is possible even with the available buffer, x264 will force a keyframe. This initial version does not support B-frames or intra refresh. Recommended usage is to set keyint to a very large value, so that keyframes do not occur except as necessary for extreme error recovery. Full documentation is in x264.h. Move DTS/PTS calculation to before encoding each frame instead of after. Improve documentation of x264_encoder_intra_refresh. commit 669cc1def2034a7ef55946df9f6e1ae13963eb8a [revision 1658] Author: Fiona Glaser Date: Thu Jun 17 14:50:07 2010 -0700 Lookaheadless MB-tree support Uses past motion information instead of future data from the lookahead. Not as accurate, but better than nothing in zero-latency compression when a lookahead isn't available. Currently resets on keyframes, so only available if intra-refresh is set, to avoid pops on non-scenecut keyframes. Not on by default with any preset/tune combination; must be enabled explicitly if --tune zerolatency is used. Also slightly modify encoding presets: disable rc-lookahead in the fastest presets. Enable MB-tree in "veryfast", albeit with a very short lookahead. commit d020c4274edab45314c6bcf324d05f21dd13a93c [revision 1657] Author: Lamont Alston Date: Wed Jun 16 10:05:17 2010 -0700 Open-GOP support Allows B-frames immediately prior to keyframes (in display order). This helps reduce keyframe popping and improve compression with short keyframe intervals. Due to a staggering display of braindamage in the Blu-ray spec, two open-GOP modes are available. The two modes calculate keyframe interval differently: one based on coded distance and one based on display distance. The latter is superior compression-wise, but for no comprehensible reason, Blu-ray requires the former if open-GOP is used. commit 81cada8effc3e91eec3f413772b3c1629e8beb4d [revision 1656] Author: Steven Walters Date: Wed Jun 9 18:14:52 2010 -0400 Use threadpools to avoid unnecessary thread creation Tiny performance improvement with fast settings and lots of threads. May help more on some OSs with slow thread creation, like OS X. Unify inconsistent synchronized abbreviations to sync. commit 1a3548cf7bbebe7aa69f2ec65f6d36dc08afafc8 [revision 1655] Author: Fiona Glaser Date: Sat Jun 19 01:41:07 2010 -0700 Improve 2-pass bitrate prediction Adapt based on distance to the end in bits, not in frames. Helps in videos with absurdly simple end sections, e.g. black frames. commit af34dfe35f6421dd0cd262d2263111f5f1a11f2d [revision 1654] Author: Fiona Glaser Date: Fri Jun 18 13:58:11 2010 -0700 SSE4 and SSSE3 versions of some intra_sad functions Primarily Nehalem-optimized. commit 5a57688fa282c31f070f147790dec0793adc843b [revision 1653] Author: Fiona Glaser Date: Sat Jun 19 03:27:33 2010 -0700 Improve HRD accuracy In a staggering display of brain damage, the spec requires all HRD math to be done in infinite precision despite the output being of quite limited precision. Accordingly, convert buffer management to work in units of timescale. These accumulating rounding errors probably didn't cause any real problems, but might in theory cause issues in very picky muxers on extremely long-running streams. commit f2b78b93fa6568528a2ee0efb0b00834002df49a [revision 1652] Author: Fiona Glaser Date: Tue Jun 22 14:20:46 2010 -0700 Use -fno-tree-vectorize to avoid miscompilation Some versions of gcc have been reported to attempt (and fail) to vectorize a loop in plane_expand_border. This results in a segfault, so to limit the possible effects of gcc's utter incompetence, we're turning off vectorization entirely. It's not like it ever did anything useful to begin with. commit 8060431f0d60f97a9b5274ceb230fbcdb3e2cffd [revision 1651] Author: Anton Mitrofanov Date: Sat Jun 19 01:44:56 2010 +0400 Fix SIGPIPEs caused by is_regular_file checks Check to see if input file is a pipe without opening it. commit ed0b9b6df2bbb11268da9b1b4e7d3b217bc0b5c7 [revision 1650] Author: Fiona Glaser Date: Tue Jun 15 05:15:42 2010 -0700 Fix compilation on ARM w/ Apple ABI commit 15501e340f0500eedb797390f74e6e35f58ba12e [revision 1649] Author: Holger Lubitz Date: Wed Jun 9 13:59:06 2010 +0200 Faster mbtree_propagate asm Replace fp division by multiply with the reciprocal. Only ~12% faster on penryn, but over 80% faster on amd k8. Also make checkasm slightly more tolerant to rounding error. commit c224a0a5d91fb9115071b4ff075e91d2b3f630e2 [revision 1648] Author: Diogo Franco Date: Sun Jun 13 21:57:32 2010 -0300 Convert the OPT_ defines in x264.c to an enum commit 43792a1b1bbd00193bf26c21f37be03777b8eb6d [revision 1647] Author: Anton Mitrofanov Date: Sun Jun 13 23:14:15 2010 +0400 Don't allow baseline profile streams with fake-interlaced Indicate use of --fake-interlaced in encoding options SEI. commit 317001c67caffc6618c0b736d22e43795f1efed3 [revision 1646] Author: Havoc Pennington Date: Thu Jun 10 16:28:52 2010 -0400 Allocate space for null terminator in param_apply_tune commit 4fda9276ff98445f491d316a51f8d2c89ec2d85b [revision 1645] Author: Anton Mitrofanov Date: Thu Jun 10 21:33:46 2010 +0400 Fix regression in r1501. Could cause slightly incorrect analysis in rare cases, but no serious encoding issues. Also shut up gcc warning about pels_v. commit 9d36bceed51ac71b5cbf645b4900b5a7190840a0 [revision 1644] Author: Anton Mitrofanov Date: Wed Jun 9 22:53:08 2010 +0400 Fix crash with --subme 0 + --weightp > 0. Regression in r1535 commit ffdb014fa27b703046349ba70f432b883d927a70 [revision 1643] Author: Henrik Gramner Date: Tue Jun 8 16:29:16 2010 +0200 Replace some divisions with shifts commit 64fa70e78c459688197ce907326b21bf4799f8ab [revision 1642] Author: Anton Mitrofanov Date: Tue Jun 8 02:43:37 2010 +0400 Warn about shadowed variable declarations Also get rid of a few instances of variable shadowing. commit 1bc9ad14e4655780bdf509c6d29f4b1e9d447fe4 [revision 1641] Author: Fiona Glaser Date: Mon Jun 7 14:26:05 2010 -0700 Template load_pic_pointers based on interlaced Significantly speeds up cache_load in the non-interlaced case. Also various other minor optimizations in cache_load and cache_save. commit fcda8dd98eaf9ceea950c95661f8228fb364fc0b [revision 1640] Author: Fiona Glaser Date: Mon Jun 7 14:15:33 2010 -0700 Remove double-dereferences for MB width/height data Store it in x264_t instead of going through the SPS. commit 894634306a53830d6f6f7a8d0ef927af414b3aad [revision 1639] Author: Steven Walters Date: Sat May 22 20:54:35 2010 -0400 Exempt Win x86_64 from memalign hack The API mandates all mallocs are 16 byte aligned. Remove unused int that stores sizeof malloc in memalign hack. commit f9bc2de28f637fa199424f544c94aeabc551eeb4 [revision 1638] Author: Steven Walters Date: Fri Jun 4 13:44:55 2010 -0700 Preprocessing cosmetics Unify input/output defines to HAVE_* format. Define values as 1 to simplify conditionals. commit 691e2db1ff45f98e9696a5b37b761da7d03a64f3 [revision 1637] Author: Fiona Glaser Date: Thu Jun 3 21:31:10 2010 -0700 Take more shortcuts in i4x4/i8x8 analysis Based on the scores of the H and V modes, rule out modes which are unlikely. Small compression loss (0.1-0.5%) and large speed gain (10-30% faster intra analysis). Not enabled in slower encoding modes. Also make C versions of the merged SATD functions in order to eliminate branches based on their availability. commit 3cd5117da50bc1925086f684f34d7d5422d28520 [revision 1636] Author: Fiona Glaser Date: Wed Jun 2 15:47:26 2010 -0700 Display SSIM measurement in db as well commit bef006e64034f3d3d24fdb1b06a9ac605eae9e64 [revision 1635] Author: Anton Mitrofanov Date: Tue Jun 8 01:03:03 2010 +0400 Make version.sh indicate "M" for local commits too commit 5ab417745cce40869ec59eb28fde8677e974c249 [revision 1634] Author: Alex Jurkiewicz Date: Sun Jun 6 15:21:12 2010 +0800 Add error message for invalid [de]muxer selection commit 4f5d9bcea757f049c1a14dd902c4af76dee231c1 [revision 1633] Author: Nathan Caldwell Date: Sun Jun 6 14:19:41 2010 -0600 Deduplicate the ALIGN macro, move it to common.h commit e02e3eb59e2ab921117b89bf302ac70b7628baa9 [revision 1632] Author: David Conrad Date: Thu Jun 3 19:02:24 2010 -0400 Fix a use of ALIGNED_ARRAY_16 on ARM commit 032113205fd60e70b7e50b5109e94ec2062067e9 [revision 1631] Author: Fiona Glaser Date: Tue Jun 8 15:41:17 2010 -0700 Add missing emms after nal_encode Caused random, bizarre failures with some calling applications. commit 23a20180338226f8bcba05c46867f38eff750cc3 [revision 1630] Author: Fiona Glaser Date: Tue Jun 8 15:38:32 2010 -0700 Fix crash in fake-interlaced at some resolutions commit da1bc99cd4c74499aca99cbfbfc014154bb32440 [revision 1629] Author: Yusuke Nakamura Date: Wed Jun 2 22:27:57 2010 +0900 Fix no-mbtree + aq-mode=0 Regression in r1618. commit 36bbd4d2134106943b7496b376603a97010ce308 [revision 1628] Author: Fiona Glaser Date: Wed Jun 2 01:07:44 2010 -0700 Add API function to fix x264_picture_t initialization Calling applications that do not use x264_picture_alloc need to use x264_picture_init to initialize x264_picture_t structures. Previously, if the calling application didn't zero x264_picture_t, Bad Things could happen. commit f857c08db810a10332b46f4c331ac098ec3db9c7 [revision 1627] Author: Yusuke Nakamura Date: Wed Jun 2 17:02:31 2010 +0900 Fix Avisynth input Regression in r1624. A more permanent solution to the problem will be committed later. commit e46bf243d4c05f9abb106573b4c46d4fe88caba2 [revision 1626] Author: Oskar Arvidsson Date: Wed Jun 2 02:08:45 2010 +0200 Convert to a unified "dctcoeff" type for DCT data Necessary for future high bit-depth support. commit 17a04af4e35de32822024caf91e6f75400593394 [revision 1625] Author: Oskar Arvidsson Date: Wed Jun 2 01:35:38 2010 +0200 Convert to a unified "pixel" type for pixel data Necessary for future high bit-depth support. Various macros and extra types have been introduced to make operations on variable-size pixels more convenient. commit 7adf25b165b4c6c69c3bcba7ed949996dca6f116 [revision 1624] Author: Fiona Glaser Date: Fri May 28 14:27:22 2010 -0700 Add API tool to apply arbitrary quantizer offsets The calling application can now pass a "map" of quantizer offsets to apply to each frame. An optional callback to free the map can also be included. This allows all kinds of flexible region-of-interest coding and similar. commit 6589ad6dc6a2ac7599c5a19566306c274bd86853 [revision 1623] Author: Fiona Glaser Date: Thu May 27 14:27:32 2010 -0700 x86 assembly code for NAL escaping Up to ~10x faster than C depending on CPU. Helps the most at very high bitrates (e.g. lossless). Also make the C code faster and simpler. commit 9056470d688eeb0f337a1976576b3dac601d882c [revision 1622] Author: Fiona Glaser Date: Fri May 28 14:30:07 2010 -0700 Re-enable i8x8 merged SATD Accidentally got disabled when intra_sad_x3 was added. commit 156b119f3de35e458f037e87d1ccf467ad86da5b [revision 1621] Author: Henrik Gramner Date: Sun May 30 22:45:14 2010 +0200 Some deblocking-related optimizations commit 260da1ce37ce8964b5a7dc697723d064d60b335e [revision 1620] Author: Henrik Gramner Date: Thu May 27 22:18:38 2010 +0200 Optimize out some x264_scan8 reads commit 9dc7a03fa5bd36862c456e1b9b2cba238cb3c89c [revision 1619] Author: Fiona Glaser Date: Thu May 27 10:42:15 2010 -0700 Add fast skip in lookahead motion search Helps speed very significantly on motionless blocks. commit 0010a130bf8939cc66e576fa53b7a7ad94fe32f3 [revision 1618] Author: Fiona Glaser Date: Wed May 26 12:55:35 2010 -0700 Merge some of adaptive quant and weightp Eliminate redundant work; both of them were calculating variance of the frame. commit 19e1f24e09ee540a8517bd6d36de5c1b828c24b6 [revision 1617] Author: Fiona Glaser Date: Thu May 27 12:31:41 2010 -0700 Fix omission in libx264 tuning documentation commit ccd20017c79b00abf61e9009e8b28a5eb440c985 [revision 1616] Author: Fiona Glaser Date: Sun May 30 09:42:53 2010 -0700 Fix ultrafast to actually turn off weightb commit cb94c2bdc9989d55deb6899ab77b0d40d185ab21 [revision 1615] Author: Anton Mitrofanov Date: Mon May 31 22:36:50 2010 +0400 Fix crash with MP4-muxing if zero frames were encoded commit 16adb51780fa73f260eb26f75715faf4b2cd9cb8 [revision 1614] Author: Fiona Glaser Date: Mon May 31 11:14:22 2010 -0700 Fix cavlc+deblock+8x8dct (regression in r1612) Add cavlc+8x8dct munging to new deblock system. May have caused minor visual artifacts. commit f8bd69dc667aec425f84c9b5d13dbf85d08d5e05 [revision 1613] Author: Fiona Glaser Date: Wed May 26 12:40:31 2010 -0700 Fix 10L in r1612 Stats need to be calculated before deblock strength, not after. Broke ref stats in x264cli (no affect on actual output). commit 4947b0fbe0882defe5f806a0c42978bd160d6da0 [revision 1612] Author: Fiona Glaser Date: Tue May 25 12:42:44 2010 -0700 Overhaul deblocking again Move deblock strength calculation to immediately after encoding to take advantage of the data that's already in cache. Keep the deblocking itself as per-row. commit 57729402c7b34d91cab058c00a5f6e50a2ef72a3 [revision 1611] Author: Fiona Glaser Date: Tue May 25 16:13:59 2010 -0700 Detect Atom CPU, enable appropriate asm functions I'm not going to actually optimize for this pile of garbage unless someone pays me. But it can't hurt to at least enable the correct functions based on benchmarks. Also save some cache on Intel CPUs that don't need the decimate LUT due to having fast bsr/bsf. commit 0f249f12470cef5187674f13bf2cfcb3938f8563 [revision 1610] Author: Fiona Glaser Date: Mon May 24 11:13:22 2010 -0700 Slightly faster mbtree asm commit 4d41be9b18375a19a020c75e19db35c3d41834b3 [revision 1609] Author: Fiona Glaser Date: Fri May 21 15:39:38 2010 -0700 Faster deblock strength asm on conroe/penryn commit 8423fd9ea4cc1b57478a294a77aa725d025fbfee [revision 1608] Author: Fiona Glaser Date: Fri May 21 14:32:13 2010 -0700 Avoid an extra var2 in chroma encoding if possible Also remove a redundant if. commit 0d74fbda1559edd4240a956ba4a232adf2c0c8c5 [revision 1607] Author: Fiona Glaser Date: Fri May 21 13:07:12 2010 -0700 Avoid a redundant qpel check in lookahead with subme <= 1. commit a38f372cfe3c508594f49f6a02d50ea9418a4c09 [revision 1606] Author: Anton Mitrofanov Date: Tue May 25 19:11:42 2010 +0400 Fix ABR rate control calculations Incorrect frame numbers were used, resulting in slightly inaccurate ratecontrol. commit 30a202d2b96b6737b399e82dc82f51bc694ae790 [revision 1605] Author: Anton Mitrofanov Date: Tue May 25 18:45:16 2010 +0400 Fix calculation of total bitrate printed after stop by CTRL+C commit 3e8068ab1c8b0fdc491950b1b50c2e5c3149a51e [revision 1604] Author: Kieran Kunhya Date: Sat May 22 14:32:53 2010 +0100 Fix typo in fake-interlaced documentation commit 318d9d659619617ba6a04835f15ef133c1ddaaf1 [revision 1603] Author: Fiona Glaser Date: Tue May 25 17:49:07 2010 -0700 Fix CABAC+PCM, regression in r1592 Changes to queue in CABAC didn't get propagated to PCM code. commit 5f9003633736b288265481c57fa779ac200b96a0 [revision 1602] Author: Henrik Gramner Date: Fri May 21 15:30:26 2010 +0200 Fix performance regression in r1582 Set the correct compiler flags. commit 2ea35adf96ab0bdb830692492f38c98caa28684d [revision 1601] Author: Fiona Glaser Date: Tue May 18 16:48:00 2010 -0700 Rewrite deblock strength calculation, add asm Rewrite is significantly slower, but is necessary to make asm possible. Similar concept to ffmpeg's deblock strength asm. Roughly one order of magnitude faster than C. Overall, with the asm, saves ~100-300 clocks in deblocking per MB. commit 4bebd7414cdc1a2a4c06004951686071c3a9b532 [revision 1600] Author: Anton Mitrofanov Date: Fri May 21 10:33:45 2010 +0400 Fix different output with differing sync-lookahead Also reduce memory consumption. commit a768a0e6a2063cab3ab287f73a600daf400d7ec1 [revision 1599] Author: Anton Mitrofanov Date: Tue May 18 22:26:59 2010 +0400 Mark Win32 executable as large address aware commit 2b61248f4bcbb5f9df32d940732bc26d8feeda8c [revision 1598] Author: Kieran Kunhya Date: Thu May 20 17:45:16 2010 +0100 Add "Fake interlaced" option This encodes all frames progressively yet flags the stream as interlaced. This makes it possible to encode valid 25p and 30p Blu-Ray streams. Also put the pulldown help section in a more appropriate place. commit 0a3b4aded728e162cb9a59befd0a3da3553bee7a [revision 1597] Author: Alex Jurkiewicz Date: Thu May 20 15:01:37 2010 +0800 Modify version.sh to output to stdout. Update configure to match. commit a1d7bbe4d270479c5905a39e995814ce80f72587 [revision 1596] Author: Henrik Gramner Date: Wed May 19 23:09:58 2010 +0200 Set correct filesystem permissions for various files commit 9a0c21a943cb669b4fc4d38044a0edfc9413291f [revision 1595] Author: Anton Mitrofanov Date: Wed May 19 21:07:03 2010 +0400 Fix regression in r1566 Intra stats need to be kept track of for fast intra decision. commit 1a08335a07adb4a60fc949a749b2dd71f5c11c02 [revision 1594] Author: Fiona Glaser Date: Tue May 18 11:53:32 2010 -0700 Fix rc-lookahead in encoding options SEI in 2-pass with VBV commit 047ae529404ac663f85380054d8e446c55e7c2af [revision 1593] Author: Loren Merritt Date: Mon May 17 14:08:37 2010 -0700 Reduce memory usage in 2-pass with b-adapt 2 commit 3267f35a63a05bad83e7c50df887984254346785 [revision 1592] Author: Fiona Glaser Date: Sat May 15 14:48:58 2010 -0700 Overhaul CABAC: faster, less cache usage Horribly munge up the CABAC tables to allow deduplication of some data. Saves 256 bytes of L1d cache in non-RD, 512 bytes in RD. Add asm versions of bypass and terminal; save L1i cache by re-using putbyte code. Further optimize encode_decision. All 3 primary CABAC functions fit in under 256 bytes of code total on x86_64. commit 52206369380c1b91d45fc8ee88f036b6e4fee5d5 [revision 1591] Author: Kieran Kunhya Date: Thu May 13 19:13:35 2010 +0100 Fix typo in pulldown commit 8939a416c0553f2c0d494d126044211007c742fb [revision 1590] Author: Anton Mitrofanov Date: Wed May 12 22:05:34 2010 +0400 Fix bitrate calculation in progress status Was slightly incorrect due to using pts, which is out of order. commit 53eda22e7d245f3f435903a64fd99d7ecce79ab1 [revision 1589] Author: Anton Mitrofanov Date: Wed May 12 01:57:38 2010 +0400 Fix crash with sliced-threads on Phenom commit af0c64f701d90cf9326185e954f3d85e25bfb338 [revision 1588] Author: Fiona Glaser Date: Mon May 10 22:59:12 2010 -0700 Fix condition for printing rc=cbr in options SEI Also fix crf-max formatting. commit 8c02c790353c3ef8ffd091567e04d4f73b8ad2f8 [revision 1587] Author: Henrik Gramner Date: Mon May 10 23:27:36 2010 +0200 Shrink even more constant arrays commit dfba665aa511ffa2fa17fbb9c71e980c4216accc [revision 1586] Author: Fiona Glaser Date: Sat May 8 12:07:13 2010 -0700 Add API function to trigger intra refresh Useful for interactive applications where the encoder knows that packet loss has occurred on the client. Full documentation is in x264.h. commit a7d75da6b4e7e4d57791a1b58abe164da61e6f00 [revision 1585] Author: Fiona Glaser Date: Sat May 8 11:58:22 2010 -0700 Fix intra refresh behavior with I-frames Intra refresh still allows I-frames (for scenecuts/etc). Now I-frames count as a full refresh, as opposed to instantly triggering a refresh. commit 54e784fdf410bf6dd7dd2312251fbe576a0d03fd [revision 1584] Author: Anton Mitrofanov Date: Thu May 6 10:03:31 2010 -0700 More cosmetics commit e997028964e4023552411176bce526c98c793d34 [revision 1583] Author: Fiona Glaser Date: Thu May 6 00:53:20 2010 -0700 Fix unresolved symbol in r1573 gnu ld didn't complain, but some other linkers did. commit c74934475b92b5dea2c48db8dd08c4ab0e93c31e [revision 1582] Author: Steven Walters Date: Wed May 5 19:54:04 2010 -0400 Remove unnecessary --enable options Change --enable-visualize to actually check for X11 support. commit 9ce2783458beaf3a66089a7c82ad0b5ede0c48bd [revision 1581] Author: Fiona Glaser Date: Mon May 3 21:27:16 2010 -0700 Don't force row QPs to integer values with VBV VBV should no longer raise the bitrate of the video. That is, at a given quality level or average bitrate, turning on VBV should only lower the bitrate. This isn't quite true if adaptive quant is off, but nobody should be doing that anyways. Also may result in slightly more accurate per-row VBV ratecontrol. commit 43564b799787749cf14a33a47e852d34de73758b [revision 1580] Author: James Darnley Date: Sun May 2 16:30:50 2010 -0700 Add field-order detection to y4m demuxer commit d7268f19b909566e94760bc49b01a5596c0b4ac6 [revision 1579] Author: Fiona Glaser Date: Sun May 2 11:45:15 2010 -0700 Fix sliced-threads + interlaced Broken in r1546. commit 94123d65e8c23e8fa05b138f9770e58d975b1cc0 [revision 1578] Author: Fiona Glaser Date: Sun May 2 11:41:36 2010 -0700 Improve temporal MV prediction Predict based on the results of p16x16 search, not final MVs. This lets us get predictions even if mode decision chose intra. Also improves cache coherency. commit 8399311e5bccb75d6c1327d3ee050c68eefe8c5c [revision 1577] Author: Fiona Glaser Date: Sat May 1 19:34:14 2010 -0700 More accurate MV prediction on edges in lookahead commit 15c02c2d10fcd532d873d08ac929d8f8cae694f9 [revision 1576] Author: Fiona Glaser Date: Sat May 1 19:32:01 2010 -0700 Error out on invalid input stride Might catch some crashes due to buggy calling applications. commit 68438826539ee3376e0469e16996a35e544176ef [revision 1575] Author: Fiona Glaser Date: Sat May 1 00:18:01 2010 -0700 Remove unnecessary debugging assert Shouldn't have been in r1568 to begin with. commit 795a64f1f26dee1bff676dd223f8c93a0a58e1fe [revision 1574] Author: Fiona Glaser Date: Fri Apr 30 13:45:50 2010 -0700 Shrink some more constant arrays commit 311c4bb16a49e7a37408c3e29a6d385883592f11 [revision 1573] Author: Fiona Glaser Date: Fri Apr 30 11:36:19 2010 -0700 Deduplicate asm constants, automate name prefixing Auto-prefix global constants with x264_ in cextern. Eliminate x264_ prefix from asm files; automate it in cglobal. Deduplicate asm constants wherever possible to save data cache (move them to a new const-a.asm). Remove x264_emms() entirely on non-x86 (don't even call an empty function). Add cextern_naked for a non-prefixed cextern (used in checkasm). commit cca478edc595d507d6486d548448802461a74547 [revision 1572] Author: Fiona Glaser Date: Fri Apr 30 09:57:55 2010 -0700 Shrink a few x86 asm functions Add a few more instructions to cut down on the use of the 4-byte addressing mode. commit c490e416499d275be462cbf9e071df4a9a5b7484 [revision 1571] Author: Fiona Glaser Date: Thu Apr 29 19:53:59 2010 -0700 Make options SEI use weight* instead of wpred* More intuitive and maps more reasonably to the CLI options. Breaks statsfile backwards-compatibility. commit 6d12fae91a5faa4f82917f5caaed4ddad39ac591 [revision 1570] Author: Loren Merritt Date: Thu Apr 29 17:35:25 2010 +0000 r1548 broke subme < 3 + p8x8/b8x8 Caused significantly worse compression. Preset-wise, only affected veryfast. Fixed by not modifying mvc in-place. commit 13922ab880162530b1acee4dfddfd046dbdeb0f3 [revision 1569] Author: Henrik Gramner Date: Tue Apr 27 01:44:33 2010 +0200 More write-combining commit a40aa64dadb89d371671d49419f3b763302925f5 [revision 1568] Author: Fiona Glaser Date: Mon Apr 26 15:10:11 2010 -0700 Reduce lookahead memory usage, cache misses Merge lowres_types with lowres_costs. commit a6410b8c28645326c332857fc47d985b9031617c [revision 1567] Author: Fiona Glaser Date: Sun Apr 25 14:54:29 2010 -0700 Fix build on x86 with asm on but SSE off commit 22acdd610c9d9bdda31295c388a4d59d93b5d704 [revision 1566] Author: Fiona Glaser Date: Sat Apr 24 13:55:51 2010 -0700 Don't calculate ref/partition stats if not necessary commit 7d38392b3818f056088ce4f475626bdd2be018f4 [revision 1565] Author: Fiona Glaser Date: Sat Apr 24 13:07:18 2010 -0700 Split out MV prediction into mvpred.c Make common/macroblock.c a bit less gigantic. commit 8a8d72fee877e32a300419114e02038ddb993d46 [revision 1564] Author: Loren Merritt Date: Sat Apr 24 16:22:14 2010 +0000 Fix mv predictor clipping on non-x86 (regression in r1548) commit 2788cdf638060ebe021a2d33d72ea0b86608bedd [revision 1563] Author: Anton Mitrofanov Date: Sat Apr 24 00:26:13 2010 +0400 Move getopt.c to x264cli sources from libx264 Only affects builds on systems without getopt.c. commit 09f97ee9f910ef157e5186bd3ad82e7818cda144 [revision 1562] Author: Fiona Glaser Date: Thu Apr 22 12:53:07 2010 -0700 Move deblocking code to a separate file Should clean up frame.c a bit. commit b3005ee3fe778d4eade4d472ee9550120040caee [revision 1561] Author: Steven Walters Date: Tue Apr 20 19:48:02 2010 -0400 fix ffms demuxer to support input timebase values > 2^31 commit a7e037971f777b107583c75af335067f3fd813e3 [revision 1560] Author: Fiona Glaser Date: Tue Apr 20 16:53:06 2010 -0700 Fix 10l in cache_load changes Broke constrained intra pred, probably not anything else. commit e2f0f1816c8e930800270b0cb2198416700761c1 [revision 1559] Author: Fiona Glaser Date: Tue Apr 20 16:50:13 2010 -0700 Faster fullpel predictor checking Also shave a few instructions off dia/hex motion estimation loops. commit 8d9fe0220794bb35a0e2b17ff9f0c0660b781bcb [revision 1558] Author: Loren Merritt Date: Tue Apr 20 09:40:49 2010 +0000 Fix checkasm's generation of deblock inputs (regression in r1517) commit e091a5e32baed410d79f871667d5f28a4fdc5a35 [revision 1557] Author: Loren Merritt Date: Tue Apr 20 09:17:18 2010 +0000 Fix printing of bitrate when timestamps aren't available Doesn't affect x264cli, but was broken in some other apps in CFR mode. commit 21f1a3c438a8404f61f3f1f1e5270d3d7beaff9d [revision 1556] Author: Fiona Glaser Date: Tue Apr 20 00:46:29 2010 -0700 Don't check mv0 twice One less SAD in motion estimation. Also rename bmv -> pmv; more accurate naming. commit 564cfb8a1173fe1e037c51e76af36e5e75fddfba [revision 1555] Author: Fiona Glaser Date: Mon Apr 19 11:02:27 2010 -0700 Remove reordering restrictions from weightp Apparently the spec does allow two consecutive copies of the same frame in the reference list. This involves an incredibly ugly hack to wrap around the frame number. Very slight compression improvement. commit df275d503348cce71c110e278f2f866e0ee87f5e [revision 1554] Author: Fiona Glaser Date: Mon Apr 19 23:34:03 2010 -0700 Print intra chroma pred modes in stats commit e3c766fcc2edcdc0d753888a95aab778d9c07769 [revision 1553] Author: Fiona Glaser Date: Sun Apr 18 22:54:48 2010 -0700 Add mv0 special case in pskip chroma MC Significantly faster pskip MC. commit f25f234555462fcd284bde0d70744ed8d774968c [revision 1552] Author: Francois Cartegnie Date: Sun Apr 18 13:04:59 2010 -0700 Fix build scripts to work with non-GNU tools commit 641a8d543d64c68fe7e1e2dd0e0ca966a4795855 [revision 1551] Author: Fiona Glaser Date: Fri Apr 16 20:04:13 2010 -0700 Faster deblock reference frame checks Use a lookup table to simplify logic commit 4e105e079314b2fe04742d5605ffb0d961c16813 [revision 1550] Author: Henrik Gramner Date: Fri Apr 16 22:39:45 2010 +0200 Faster chroma CBP handling commit d48c3809d24e8cc7caff2c39ae1544a957452787 [revision 1549] Author: Fiona Glaser Date: Fri Apr 16 11:36:43 2010 -0700 Fix issues with extremely large timebases With timebase denominators >= 2^30 , x264 would silently overflow and cause odd issues. Now x264 will explicitly fail with timebase denominators >= 2^31 and work with timebase denominators 2^31 > x >= 2^30. commit bb1294f18ad8dd938532cb4247c8b207726874ad [revision 1548] Author: Fiona Glaser Date: Fri Apr 16 12:06:07 2010 -0700 MMX code for predictor rounding/clipping Faster predictor checking at subme < 3. commit c1fb471c16332f93b71327c1783eacffb53548ec [revision 1547] Author: Fiona Glaser Date: Fri Apr 16 03:06:46 2010 -0700 Fix four minor bugs found by Clang commit 60b158144c942016db5ae6adfa3040bd395e4006 [revision 1546] Author: Fiona Glaser Date: Thu Apr 15 16:32:31 2010 -0700 Move deblocking/hpel into sliced threads Instead of doing both as a separate pass, do them during the main encode. This requires disabling deblocking between slices (disable_deblock_idc == 2). Overall performance gain is about 11% on --preset superfast with sliced threads. Doesn't reduce the amount of actual computation done: only better parallelizes it. commit 9df61bcc12b3c28e4cd743a2a789ef2f197fc1aa [revision 1545] Author: Fiona Glaser Date: Wed Apr 14 14:43:25 2010 -0700 Prefetch MB data in cache_load Dramatically reduces L1 cache misses. ~10% faster cache_load. commit 72f79049c1c34ea5feb41a05f26c42f65451b681 [revision 1544] Author: Fiona Glaser Date: Fri Apr 23 19:09:37 2010 +0000 Fix a ton of pessimization caused by aliasing in cache_save and cache_load commit f80446e889e5fb1734bc462115303593f3b093f3 [revision 1543] Author: Fiona Glaser Date: Fri Apr 23 19:09:18 2010 +0000 Add CP128/M128 macros using SSE commit ef7036991ff50eed268b924e3f669c5e1afb7f92 [revision 1542] Author: Fiona Glaser Date: Sun Apr 11 13:36:50 2010 -0700 Fix various early terminations with slices Neighbouring type values (type_top, etc) are now loaded even if the MB isn't available for prediction. Significant overall performance increase (as high as 5-10%+) with lots of slices (e.g. with slice-max-size). commit 25047b4042b18bfd7ef7d40fd48e904852da1ada [revision 1541] Author: Anton Mitrofanov Date: Tue Apr 13 21:25:42 2010 +0400 Enable --fast-pskip on fast firstpass commit 2d3b31f574f2d1cf80a51cf2af7720ed30cd10b3 [revision 1540] Author: Steven Walters Date: Tue Apr 13 08:44:37 2010 -0400 Make interlaced detection in avisynth only apply to field-based input Fixes improper flagging of progressive sources. commit e4289459eae03c18733f617012b67cd00e31b6ab [revision 1539] Author: Anton Mitrofanov Date: Tue Apr 13 19:55:12 2010 +0400 Set psy=0 in lossless mode Doesn't actually affect output, just what's written in the SEI. commit 5c88af35b79bd59d9e12e7a7761fc0e29f9075c4 [revision 1538] Author: Loren Merritt Date: Sun Apr 11 04:20:04 2010 +0000 Fix a use of sad_x4 that had non-mod64 stride Minimal speed improvement, but fixes a violation of internal api. commit 8e098f8e53de9801f2c1b382992736ffbc1e74a6 [revision 1537] Author: Fiona Glaser Date: Sat Apr 10 13:15:30 2010 -0700 Make keyint_min auto by default Gives more reasonable default settings when using short GOPs. commit 04f73bedf6a61099de58b0e03c02dc4731768884 [revision 1536] Author: Fiona Glaser Date: Sat Apr 10 00:49:19 2010 -0700 Faster mv predictor checking at subme < 3 Simplify the predicted MV cost check. commit cec7764a9a3749d6f67ea25af3082178e4d70d34 [revision 1535] Author: Fiona Glaser Date: Sat Apr 10 00:35:50 2010 -0700 Special case in qpel refine for subme=1 ~15-20% faster qpel refine with subme=1. Some minor cleanups in refine_supel. commit 9f053b5c5c59fca4c40ec7914c95b36d022c2887 [revision 1534] Author: Henrik Gramner Date: Sat Apr 10 02:21:01 2010 +0200 Cosmetics: VLC tables commit 134e221530d246e78f986e14d1f6d25a52bb3836 [revision 1533] Author: Fiona Glaser Date: Fri Apr 9 18:13:22 2010 -0700 Add faster mv0 special case for macroblock-tree Improves performance on low-motion video. commit 2907fc6cc96dc4c8e5d8ac99553e2031c0c1b0ba [revision 1532] Author: Fiona Glaser Date: Fri Apr 9 01:49:55 2010 -0700 Add miscompilation check for x264_clz Running a Phenom-optimized build of x264 (e.g. -march=amdfam10) on a non-Phenom CPU didn't SIGILL; instead it would silently produce incorrect output. Now, instead, it will error out loudly. commit d037de38df77e2594ed91f80e6ddd4e70e746e4a [revision 1531] Author: Anton Mitrofanov Date: Wed Apr 7 12:17:20 2010 +0300 Fixing floating-point exception in level-checking Doesn't cause any issues for x264cli, but might impact some calling apps that care (e.g. Delphi apps). commit 29820105cf31f3bc399e82450a2bf18944026f88 [revision 1530] Author: Fiona Glaser Date: Thu Apr 8 18:44:16 2010 -0700 Save a few bits in multislice encoding Set the initial QP for each slice to the last QP of the previous slice. commit 788b8b7e5ef458fc2c312f72415740807f43cf99 [revision 1529] Author: Alex Wright Date: Thu Apr 8 01:25:55 2010 +1000 Early termination in 16x8/8x16 search Combine the actual cost of the first partition with the predicted cost of the second to avoid searching the second when possible. Reduces the number of times the second partition is searched by up to ~75% in non-RD mode, ~10% in RD mode. Negligible effect on compression. commit 049b662b98e80bffa5e21f771f396559a13c3ced [revision 1528] Author: Fiona Glaser Date: Wed Apr 7 07:45:00 2010 -0700 Make MV prediction work across slice boundaries Should improve motion search with lots of small slices, e.g. with slice-max-size. Still restricted by sliced threads (won't cross the boundary between two threadslices). The output-changing part of the previous patch. commit 95df880ca172e995ea0d3bdd76544f8f84db7a64 [revision 1527] Author: Fiona Glaser Date: Wed Apr 7 07:43:46 2010 -0700 Cleanup and simplification of macroblock_load Doesn't do anything now, but will be useful for many future changes. Splitting out neighbour calculation will make MBAFF implementation easier. Calculation of neighbour_frame value (actual neighbouring MBs, ignoring slices) will be useful for some future patches. commit 459473b212a21aa280b7dd0c355ae73847a988a4 [revision 1526] Author: Fiona Glaser Date: Wed Apr 7 03:10:03 2010 -0700 Add missing #include to display-x11.c commit bc7d6c3b758f7cf828ada74cac9a05435d8425ef [revision 1525] Author: Steven Walters Date: Tue Apr 6 22:08:21 2010 -0400 Add TFF/BFF detection to all demuxers Fix interlaced Avisynth input, automatically weave field-based input. commit df902b5b4b672016b03eb650618ff6bd3e188c96 [revision 1524] Author: Fiona Glaser Date: Tue Apr 6 13:53:22 2010 -0700 Correctly mark output frames as BREF Simplify pic_out code. commit e9726b63b92d9b704a9e8cbf9665ec0621ada5bb [revision 1523] Author: Kieran Kunhya Date: Sat Apr 3 14:59:59 2010 -0700 Fix HRD compliance As usual, the spec is so insanely obfuscated that it's impossible to get things right the first time. commit de9e381d23fa7574003502a514bed3b624a6e41b [revision 1522] Author: Alex Wright Date: Sat Apr 3 14:50:26 2010 -0700 Better b16x8/8x16 early termination in B-frames A bit slower but up to 1-2% better compression. commit 43d3e08fd1b1cd481acc8944d8f685f9fb383387 [revision 1521] Author: Fiona Glaser Date: Fri Apr 2 12:23:52 2010 -0700 Fix 10L in B-skip improvement patch commit 4d92f3f1cc263695debcdc4c8fa5016504225ad3 [revision 1520] Author: Fiona Glaser Date: Fri Apr 2 03:09:48 2010 -0700 Fix printing of SEI header with VBV + ABR SEI header shouldn't say CBR unless bitrate == maxrate. commit 3a6946754b5d14914132aae2971c8318078672d2 [revision 1519] Author: Fiona Glaser Date: Thu Apr 1 22:33:42 2010 -0700 Simplify slicetype_frame_cost Avoid redundant calculations when VBV is on (due to the intra-only call). Move most of the logic into per-MB code. commit 68cae61a9f484274594eeb264355f9c364f317c5 [revision 1518] Author: Fiona Glaser Date: Thu Apr 1 15:51:59 2010 -0700 Faster CABAC state copying for small partitions Save ~25 clocks per i4x4, i8x8, and sub8x8 RD call. commit 58d2349dd7aad34a2cf09be081670d510657eda1 [revision 1517] Author: Fiona Glaser Date: Wed Mar 31 01:44:07 2010 -0700 Massive cosmetic and syntax cleanup Convert all applicable loops to use C99 loop index syntax. Clean up most inconsistent syntax in ratecontrol.c, visualize, ppc, etc. Replace log(x)/log(2) constructs with log2, and similar with log10. Fix all -Wshadow violations. Fix visualize support. commit 3b31b6cd2ed6e368970171c5a36d66dcfc0917dd [revision 1516] Author: Fiona Glaser Date: Tue Mar 30 23:30:09 2010 -0700 Fix array overread in b8x16 search commit d45ad67fd03c7bce60bc06d4cae074549a34b6c7 [revision 1515] Author: Fiona Glaser Date: Mon Mar 29 19:03:13 2010 -0700 Faster direct check with subpartitions off Also simplify the whole function a bit. commit 30fda434119c30f70b7b5124eb811b52a85cf768 [revision 1514] Author: Fiona Glaser Date: Mon Mar 29 02:14:25 2010 -0700 Print crf-max with appropriate precision in SEI commit dab0ee2f4f67363511344ba9ba134cf32373f9d7 [revision 1513] Author: Yusuke Nakamura Date: Mon Mar 29 00:05:30 2010 -0700 Fix 10l in timecode seeking commit 549b115a89c33db9776a39df5351f7a241877314 [revision 1512] Author: Yusuke Nakamura Date: Mon Mar 29 13:51:02 2010 +0900 Fix 10L: Remove needless error check This error check was for cfr input + --timebase, but that doesn't happen, and brings about a bug with vfr input. commit 9fbcc12abe4f78c0f0d9ba44813b97528d9532db [revision 1511] Author: Fiona Glaser Date: Sun Mar 28 20:40:42 2010 -0700 Don't use 2 L1 refs with pyramid + ref=1 Slightly faster encoding with ref=1. commit d427ae20edba2b1509ceb9b5dea39ec33ee7b1e8 [revision 1510] Author: Henrik Gramner Date: Fri Mar 26 17:57:23 2010 -0700 Update copyright year in SEI header commit 0b720fee5b6adaf99c1b37c90af8e4023405d224 [revision 1509] Author: Fiona Glaser Date: Fri Mar 26 15:33:20 2010 -0700 New "superfast" preset, much faster intra analysis Especially at the fastest settings, intra analysis was taking up the majority of MB analysis time. This patch takes a ton more shortcuts at the fastest encoding settings, decreasing compression 0.5-5% but improving speed greatly. Also rearrange the fastest presets a bit: now we have ultrafast, superfast, veryfast, faster. superfast is the old veryfast (but much faster due to this patch). veryfast is between the old veryfast and faster. faster is the same as before except with MB-tree on. Encoding with subme >= 5 should be unaffected by this patch. commit 4805079dfe3173802e06630fa27841d57aed5952 [revision 1508] Author: Fiona Glaser Date: Thu Mar 25 14:46:24 2010 -0700 Avoid redundant MV prediction in duplicate refs commit 54e09223021a67bc173efc9e91b02d5ccf81d188 [revision 1507] Author: Henrik Gramner Date: Wed Mar 24 23:27:30 2010 +0100 Cosmetics in mvd handling Use a 2D array instead of doing manual pointer arithmetic. commit de8f0ac83809fc127d3ed63abe6b2392698eea68 [revision 1506] Author: Fiona Glaser Date: Wed Mar 24 07:25:01 2010 -0700 Fix make uninstall on systems with executable suffixes commit aad4437600e6f9945a42c024e11de4bf5a785a06 [revision 1505] Author: Fiona Glaser Date: Tue Mar 23 14:00:58 2010 -0700 Add tune for still image compression There has been some demand for this from companies looking to use x264 for still image compression (it can outperform JPEG or JPEG-2000 by a factor of 2 or more). Still image compression is a bit different; because temporal stability isn't an issue, we can get away with far more powerful psy settings. commit 774dbb4795638f4b8ead6a77bc045584223f4d03 [revision 1504] Author: Henrik Gramner Date: Mon Mar 22 02:59:50 2010 +0100 Pad non-mod16 resolutions using the correct field Improves compression of interlaced videos with non-mod16 heights. commit e4404fa3f491b8bfad496b300d065569e5a292bc [revision 1503] Author: Fiona Glaser Date: Sun Mar 21 09:10:00 2010 -0700 Document slow/fast firstpass in --fullhelp commit 084adc2e54f78ecc0bb95966a2b179756c25a71e [revision 1502] Author: Holger Lubitz Date: Sat Mar 20 20:41:21 2010 +0100 Fix some misattributions in profiling Cycles spent in load_hadamard and the avg2 w16 ssse3 cacheline split code were misattributed. commit e77bbb6af56d2c7ff2f184e6cfdcac6f2328ccfa [revision 1501] Author: Fiona Glaser Date: Sat Mar 20 17:07:12 2010 -0700 Much faster non-RD intra analysis Since every pred mode costs at least 1 bit, move that part into the initial SATD cost. This lets i4x4/i8x8 analysis terminate earlier. If the cost of the predicted mode is less than the cost of signalling any other mode, early-terminate the analysis. commit d8d83a9624744b4fc79cf71d31ef32c2678c4dae [revision 1500] Author: Fiona Glaser Date: Wed Mar 17 15:53:43 2010 -0700 Fix stack alignment in sliced threads Could cause crashes when called from non-GCC-compiled applications. commit 18eed0b9ee1314cc3ba9d16c0e44401f62aba624 [revision 1499] Author: Henrik Gramner Date: Tue Mar 16 01:46:00 2010 +0100 Cosmetics: use sizeof() where appropriate commit 137e233f39438654d0c7d17c8e723a8eecc02128 [revision 1498] Author: Fiona Glaser Date: Mon Mar 15 00:01:57 2010 -0700 Split up analyse_init Save some time by avoiding some unnecessary inits and moving other parts to per-thread init. commit 7a282a5892454f441d94e64e0a41c617472fa798 [revision 1497] Author: Henrik Gramner Date: Mon Mar 15 01:19:45 2010 +0100 Reduce stack usage of b-adapt 2's trellis Also remove some redundant code. commit 37b4707b7d868206ca2b35ac85c0fc7a7848838e [revision 1496] Author: Fiona Glaser Date: Sun Mar 14 00:25:02 2010 -0800 Various motion estimation optimizations Faster method of checking MV range. Predict MVs and cache MVs/MVDs for bidir qpel-RD. A whole bunch of other minor optimizations. Slightly better performance and compression. commit 4c03ec69fc91c60ff250d25fe805d1d5105c5fcf [revision 1495] Author: Fiona Glaser Date: Sun Mar 14 00:19:59 2010 -0800 Overhaul macroblock_cache_rect Unify the rectangle functions into a single one similar to ffmpeg's fill_rectangle. Remove all cases of variable-size cache_rect calls; create a function-pointer-based system for handling such cases. Should greatly decrease code size required for such calls. commit 8b4cca0e41f39748bb45c5cf88231d052df4e8cf [revision 1494] Author: Fiona Glaser Date: Sun Mar 14 16:48:22 2010 -0700 Make a bunch of small functions ALWAYS_INLINE Probably no real effect for now, but needed for the next patch. commit 219505afc89c0bec136a65e68cb9fdfca6d9bf85 [revision 1493] Author: Loic Minier Date: Wed Mar 10 05:26:46 2010 -0800 Two compatibility fixes Add IA64 support in configure. commit 6f3a6d52e605acc9df8277acb5c7094190898d82 [revision 1492] Author: Henrik Gramner Date: Fri Mar 5 03:19:47 2010 +0100 Faster x264_macroblock_encode_pskip GCC is apparently unable to optimize out the calculation of a variable when it isn't used. commit 47092e82824ac0fb7f2ee370762feec2ae6d2a0a [revision 1491] Author: Fiona Glaser Date: Sun Mar 7 04:10:30 2010 -0800 Much more accurate B-skip detection at 2 < subme < 7 Use the same method that x264 uses for P-skip detection. This significantly improves quality (1-6%), but at a significant speed cost as well (5-20%). It also may have a very positive visual effect in cases where the inaccurate skip detection resulted in slightly-off vectors in B-frames. This could cause slight blurring or non-smooth motion in low-complexity frames at high quantizers. Not all instances of this problem are solved: the only universal solution is non-locally-optimal mode decision, which x264 does not currently have. subme >= 7 or <= 2 are unaffected. commit 639b18a6f9904039e46376c55ad60e24d8617ab6 [revision 1490] Author: Alexander Strange Date: Sun Mar 7 02:57:04 2010 -0500 Reformat profile restrictions in --fullhelp. Put "no interlaced", "no lossless" on their own line to avoid them running into the default options list. commit a9adb0d4ad942f4d0cf99750fbc124b173ba0a38 [revision 1489] Author: James Darnley Date: Sat Mar 6 18:28:07 2010 -0800 Fix typo in configure commit fea8f42ebe6141272cda8dd2112ba5517432b1f6 [revision 1488] Author: David Conrad Date: Sat Mar 6 10:29:57 2010 -0800 Add support for spaces to iPhone GAS preprocessor script commit 6ac9e171a44790f312b3cd0ae77b5213f04e16ba [revision 1487] Author: Yusuke Nakamura Date: Sat Mar 6 19:25:30 2010 +0900 Fix slightly wrong mp4 duration. commit ddfe41245f68771f183a3b5caa740e3aa3adce79 [revision 1486] Author: Yusuke Nakamura Date: Sat Mar 6 19:24:32 2010 +0900 Fix link errors with newest gpac cvs gpac decided to randomly break API and require us to use their own custom malloc and free. commit 2a2db86dc2bad14e13b7568ee212435cd4e5f059 [revision 1485] Author: Kieran Kunhya Date: Fri Mar 5 20:43:02 2010 +0000 Save a few bits in slice headers Don't override the maximum ref index in the slice header if it's the same as the default. Also update the naming of the relevant variables in the PPS. commit 415aac4ff746909ea45d5afe94ba256979e647bd [revision 1484] Author: Fiona Glaser Date: Thu Mar 4 09:59:09 2010 -0800 Shrink some arrays in x264_t Also remove an unnecessary assignment from cache_load. commit 30eb4abce119fe02304f16b0712a399e7e125c1d [revision 1483] Author: Fiona Glaser Date: Wed Mar 3 11:22:29 2010 -0800 Use x264_log in more places instead of fprintf commit 89183a0e32256b94c0755eaf0d494a860fa0ef08 [revision 1482] Author: Anton Mitrofanov Date: Wed Mar 3 10:14:20 2010 -0800 Fix two nondeterminisms Move noise reduction data into thread-specific data. Use correct reference list for L1 temporal predictors. commit 7ff23daa52db92d7fcc4633e8ad21f4f6a9107a5 [revision 1481] Author: Fiona Glaser Date: Fri Mar 19 14:44:10 2010 -0700 "CRF-max" support with VBV This is a rather curious feature that may have more use than is initially obvious. In CRF mode with VBV enabled, CRF-max allows the user to specify a quality level which the encoder will never go below, even due to the effects of VBV. This is not the same as qpmax, which is not aware of issues like scene complexity. Setting this WILL cause VBV underflows in any situation where the encoder would have needed to exceed the relevant CRF to avoid underflow. Why might one want to do this even if it would cause VBV underflows? In the case of streaming, particularly ultra-low-latency streaming, it may be preferable to drop frames than to display frames that are of too low a quality. Thus, in extremely complex scenes, rather than display completely awful video, the streaming server could simply drop to a lower framerate. Scenecuts, which normally look terrible under situations like single-frame VBV, could be handled by just displaying them a bit later and dropping frames to compensate. In other words, it's better to see the scenecut 150ms delayed than for it to look like a blocky mess for 150ms. On the caller-side, this would be handled by detecting the output size of x264's frames and dropping future frames to compensate if necessary. This can also be used in normal encoding simply to ensure that VBV does not hurt quality too much (at the cost of potentially causing underflows). This can help quite a lot when using single-frame VBV and sliced threads, where VBV can often be somewhat unstable. commit bb9b16b4722a1273885367f13f448516efe47ed1 [revision 1480] Author: Kieran Kunhya Date: Tue Mar 2 00:57:10 2010 -0800 Blu-ray support: NAL-HRD, VFR ratecontrol, filler, pulldown x264 can now generate Blu-ray-compliant streams for authoring Blu-ray Discs! Compliance tested using Sony BD-ROM Verifier 1.21. Thanks to The Criterion Collection for sponsoring compliance testing! An example command, using constant quality mode, for 1080p24 content: x264 --crf 16 --preset veryslow --tune film --weightp 0 --bframes 3 --nal-hrd vbr --vbv-maxrate 40000 --vbv-bufsize 30000 --level 4.1 --keyint 24 --b-pyramid strict --slices 4 --aud --colorprim "bt709" --transfer "bt709" --colormatrix "bt709" --sar 1:1 -o This command is much more complicated than usual due to the very complicated restrictions the Blu-ray spec has. Most options after "tune" are required by the spec. --weightp 0 is not, but there are known bugged Blu-ray player chipsets (Mediatek, notably) that will decode video with --weightp 1 or 2 incorrectly. Furthermore, note the Blu-ray spec has very strict limitations on allowed resolution/fps combinations. Examples include 1080p @ 24000/1001fps (NTSC FILM) and 720p @ 60000/1001fps. Detailed features introduced in this patch: Full NAL-HRD compliance, with both VBR (no filler) and CBR (filler) modes. Can be enabled with --nal-hrd vbr/cbr. libx264 now returns HRD timing information to the caller in the form of an x264_hrd_t. x264cli doesn't currently use it, but this information is critical for compliant TS muxing. Full VFR ratecontrol support: VBV, 1-pass ABR, and 2-pass modes. This means that, even without knowing the average framerate, x264 can achieve a correct bitrate in target bitrate modes. Note that this changes the statsfile format; first pass encodes make before this patch will have to be re-run. Pulldown support: libx264 allows the calling application to specify a pulldown mode for each frame. This is similar to the way that RFFs (Repeat Field Flags) work in MPEG-2. Note that libx264 does not modify timestamps: it assumes the calling application has set timestamps correctly for pulldown! x264cli contains an example implementation of caller-side pulldown code. Pic_struct support: necessary for pulldown and allows interlaced signalling. Also signal TFF vs BFF with delta_poc_bottom: should significantly improve interlaced compression. --tff and --bff should be preferred to the old --interlaced in order to tell x264 what field order to use. Huge thanks to Alex Giladi and Lamont Alston for their work on code that eventually became part of this patch. commit 4d3c4787622d44eef8b813bc4324531546bd8aa5 [revision 1479] Author: Yusuke Nakamura Date: Sun Feb 28 21:42:19 2010 -0800 Timecode input/output --tcfile-in allows a user to specify a timecode v1 or v2 file to override input timestamps. Useful for dealing with VFR input, especially when FFMS/LAVF support isn't available. --tcfile-out writes a timecode v2 file containing the timecodes of the output file. New --timebase option allows a user to change the stream timebase. Intended primarily for forcing timebase with timecode files if necessary. When using --seek, note that x264 will seek in the timecode file as well. commit 1f9393ebe9efbae2da2a70a61aad35d270bb15f6 [revision 1478] Author: Alex Wright Date: Sun Feb 28 01:29:15 2010 -0800 Mixed-refs support for B-frames Small speed cost, usually a few percent at most. Generally has lowest cost in cases when it isn't very useful. Up to ~2% better compression overall on highly complex sources. Also fix a few minor bugs in B-frame analysis and various bits of cleanup. commit a934f0fa4763f57e820b7c9b2cfcbc8c00447ba1 [revision 1477] Author: Henrik Gramner Date: Mon Mar 1 22:01:04 2010 +0100 Faster rounding of chroma DC coefficients commit 9d71ff19a5002e2cd376716bfc25623ff80cd30b [revision 1476] Author: Holger Lubitz Date: Wed Mar 24 00:54:39 2010 +0100 Faster cabac_encode_decision_asm Minimizes instruction count, which also means smaller code. Various other slight changes to allow more instruction level parallelism. commit 125b8f6c36ba4d6523add3b1815aaef95e6e95e6 [revision 1475] Author: Holger Lubitz Date: Tue Mar 23 23:13:54 2010 +0100 Faster hpel_filter On ssse3, use pmaddubsw for h filter too (similar to v filter). Change 32-bit v and c filters to write the result non-temporal. Add commented-out defines to disable non-temporal operation. Hardly any black magic here, but still a measurable win especially for ssse3. commit 6c927227611840ce1e9fe6345fe9dc8cbcff039e [revision 1474] Author: David Conrad Date: Sun Feb 28 20:34:09 2010 -0500 Ignore XYSCSS in y4m if the newer standard C tag is present Apparently y4mscaler will generate 4:2:0 files with XYSCSS set to 444 commit b1e607e1a26498232722e65449f2b2079f3cb5d1 [revision 1473] Author: Fiona Glaser Date: Tue Mar 2 10:51:15 2010 -0800 Fix regression in r1450 I_PCM blocks would cause x264 to crash or generate bad output. Simplify PCM handling. commit 5f77d1c4a03e7f3ca5a606ab07d1f3901b6d98c5 [revision 1472] Author: Fiona Glaser Date: Sat Feb 27 14:26:02 2010 -0800 Fix crash with intra-refresh + aq-mode 0 commit 8ed734467e63bc9f528eabbfd8d58c7d7adec509 [revision 1471] Author: Fiona Glaser Date: Fri Feb 26 05:04:48 2010 -0800 Fix regression in r1453 r1453 broke psy-trellis with --trellis 2 commit 14faac1eaf0760edd2332cf2aaa53311d12df061 [revision 1470] Author: Fiona Glaser Date: Thu Feb 25 02:07:48 2010 -0800 Fix regression in r1449 Incorrectly placed thread MV check could result in rare thread MV internal errors, esp. with --non-deterministic. These weren't fatal errors (x264 could recover and continue with slight compression loss). commit 269b36dbd1e6355f750ef66894423a1189597ef9 [revision 1469] Author: Fiona Glaser Date: Wed Feb 24 20:51:43 2010 -0800 Cut size of MVD arrays by a factor of 2 again Only store the MVDs of the edges of each MB. Thanks to Michael Niedermayer for the idea. commit 80949af24bee8d655c28874fea686174cc027678 [revision 1468] Author: David Conrad Date: Wed Feb 24 19:39:57 2010 -0500 Disable Altivec and VIS optimizations when --disable-asm is specified commit 9eb6ec9f017c49a7d6979c72ce0d65a0fc104f0f [revision 1467] Author: Loren Merritt Date: Tue Feb 23 23:50:23 2010 -0800 Fix a buffer overread on odd input resolutions commit 89aa4e87032a47562fae19f1ad0fbb3fe6db0ab9 [revision 1466] Author: Fiona Glaser Date: Wed Feb 24 03:49:32 2010 -0800 Fix one bug, one corner case in VBV qp_novbv wasn't set correctly for B-frames. Disable ABR code for frames with zero complexity. Disable ABR code for CBR mode; it is completely unnecessary and can have negative consequences. commit daa1342e3faac6949cb87f5d0bd4ed42c1fa572f [revision 1465] Author: David Conrad Date: Wed Feb 24 00:29:21 2010 -0500 Port Mans Rullgard's NEON intra prediction functions from ffmpeg commit 4aa5679d89f887a338adf380d46a11c53a3d9f39 [revision 1464] Author: Fiona Glaser Date: Tue Feb 23 13:52:15 2010 -0800 Remove unused function Two other minor fixes. commit 232c5e278047da45d21f1961b5eeaf848a51be23 [revision 1463] Author: Fiona Glaser Date: Tue Feb 23 10:00:41 2010 -0800 Use short startcode in more possible situations Previous patch didn't cover all possible uses according to B.1.2. commit 714692cc27d8458da1f7192b066aefc93292fdc3 [revision 1462] Author: Fiona Glaser Date: Tue Feb 23 09:50:12 2010 -0800 Fix fastfirstpass Apparently the libx264 preset changes made "fastfirstpass" into "fastsecondpass" inadvertantly. commit 7fbc84b8d3a2e57ce04586459e20cef0f566f43b [revision 1461] Author: Anton Mitrofanov Date: Tue Feb 23 09:10:26 2010 -0800 Fix various silly errors in the previous patches commit 1c14dca59a533b7b39d3ef2734683dfc69a10c25 [revision 1460] Author: Fiona Glaser Date: Tue Feb 23 02:18:07 2010 -0800 Actually error out if preset/tune/profile is invalid Got lost somewhere in the move to libx264-based presets. commit d43e46cf5b6a2730a6c216f9a584c1c7bb32d868 [revision 1459] Author: Fiona Glaser Date: Mon Feb 22 17:33:17 2010 -0800 Faster probe_skip, 2x2 DC transform handling Move the 2x2 DC DCT into the dct_dc asm function to avoid some store-to-load forwarding penalties and extra register loads. Use dct_dc as part of the early termination in probe_skip. x86 asm partially by Holger Lubitz. ARM NEON asm by David Conrad. commit e6928564728fb467269b3a8f24f0c90d0b536630 [revision 1458] Author: Fiona Glaser Date: Sun Feb 21 17:30:52 2010 -0800 Use short startcodes whenever possible Saves one byte per frame for every slice beyond the first. Only applies to Annex-B output mode. commit c9e8e4de68825662580d99d0cae6989455698c2c [revision 1457] Author: Anton Mitrofanov Date: Sun Feb 21 13:21:11 2010 -0800 New algorithm for AQ mode 2 Combines the auto-ness of AQ2 with a new var^0.25 instead of log(var) formula. Works better with MB-tree than the old AQ mode 2 and should give higher SSIM. commit a4651264360e21d903214018f9ac24e0b503fa29 [revision 1456] Author: Fiona Glaser Date: Sun Feb 21 13:20:19 2010 -0800 Abide by the MinCR level limit Some Blu-ray analyzers were complaining about this. commit 76a8276f19ca5b01b3d54858cfc95ddc20fb2a71 [revision 1455] Author: Fiona Glaser Date: Sun Feb 21 03:56:06 2010 -0800 Make b-pyramid normal the default Now that b-pyramid works with MB-tree and is spec compliant, there's no real reason not to make it default. Improves compression 0-5% depending on the video. Also allow 0/1/2 to be used as aliases for none/strict/normal (for conciseness). commit 3e411be2a3132db8672cd6b2a33c159cfed79fb8 [revision 1454] Author: Fiona Glaser Date: Sun Feb 21 01:56:12 2010 -0800 Move presets, tunings, and profiles into libx264 Now any application calling libx264 can use them. Full documentation and guidelines for usage are included in x264.h. commit 5e8645b3a53860b03838cc4a60682bceb91e919c [revision 1453] Author: Anton Mitrofanov Date: Fri Feb 19 10:45:22 2010 -0800 Faster, more accurate psy-RD caching Keep more variants of cached Hadamard scores and only calculate them when necessary. Results in more calculation, but simpler lookups. Slightly more accurate due to internal rounding in SATD and SA8D functions. commit 5c767904662ccb4703b421308d7270712f60b65b [revision 1452] Author: Fiona Glaser Date: Thu Feb 18 17:01:38 2010 -0800 Much faster and more efficient MVD handling Store MV deltas as clipped absolute values. This means CABAC no longer has to calculate absolute values in MV context selection. This also lets us cut the memory spent on MVDs by a factor of 2, speeding up cache_mvd and reducing memory usage by 32*threads*(num macroblocks) bytes. On a Core i7 encoding 1080p, this is about 3 megabytes saved. commit 1ec69befa2f71f130a4a27dc2c1670489efe452d [revision 1451] Author: Fiona Glaser Date: Thu Feb 18 10:37:57 2010 -0800 Add temporal predictor support to interlaced encoding 0.5-1% better compression in interlaced mode commit 26a341ce0b7d78da9da1e899d716c5d73f626388 [revision 1450] Author: Fiona Glaser Date: Wed Feb 17 22:41:16 2010 -0800 Keep track of macroblock partitions Allows vastly simpler motion compensation and direct MV calculation. commit eafb549997bc39e7b27e3edf8b7c11518f78735e [revision 1449] Author: Fiona Glaser Date: Tue Feb 16 10:13:33 2010 -0800 Much faster and simpler direct spatial calculation commit e5114f20bca3a282c0e7ee1267fc405e10df49af [revision 1448] Author: Fiona Glaser Date: Sun Feb 21 14:21:26 2010 -0800 SimpleBlock requires Matroska Doctype v2 commit 681ff2b0749fdd2f27356a942a5eeff24d70c1b7 [revision 1447] Author: Yusuke Nakamura Date: Tue Feb 16 11:05:21 2010 -0800 Add GPAC version check commit 930d7c11f60dd5656302bfc97865eac0ffac921e [revision 1446] Author: Fiona Glaser Date: Tue Feb 23 01:44:44 2010 -0800 Fix stupid regression in interlaced in r1430 With ref > 8 or b-pyramid, an array over-read could cause slightly incorrect B-frames. commit 43d3d921188b114fb8286a9f07370c23294f54f5 [revision 1445] Author: Fiona Glaser Date: Mon Feb 22 13:04:47 2010 -0800 Fix overread of scratch buffer Could cause crashes on non-mod16 frames. commit 466754b8dc7bdb6b32474d1feac8ff0e1451aefb [revision 1444] Author: Fiona Glaser Date: Mon Feb 22 11:21:51 2010 -0800 Fix integer overflow in chroma SSD check Could cause bad skips at very high quantizers on extreme inputs. commit 8b333f51feab0c90387294895466b1941d06acc2 [revision 1443] Author: Anton Mitrofanov Date: Tue Feb 16 09:41:55 2010 -0800 Fix I and B-frame QPs with threads Rounding errors resulted in slightly wrong QPs with threads enabled. commit 6703f0249c3ae8b7ccd799199051b41cb21761aa [revision 1442] Author: David Conrad Date: Mon Feb 15 01:02:46 2010 -0800 Fix compilation on ARM commit 699b38e0dac5d9e2840ab278a8751de46dd5598b [revision 1441] Author: Loren Merritt Date: Thu Jan 28 18:09:07 2010 +0000 Remove unnecessary PIC support macros yasm has a directive to enable PIC globally commit 6953f9eedfa9ce625efc9f6afb5b76518429198c [revision 1440] Author: Fiona Glaser Date: Sat Feb 13 11:19:38 2010 -0800 Don't even try direct temporal when it would give junk MVs In PbBbP pyramid structure, the last "b" cannot use temporal because L0Ref0(L1Ref0) != L0Ref0. Don't even bother analyzing it, just use spatial. Should improve speed and direct auto effectiveness in CRF and 1-pass modes when b-pyramid is used. Also makes --direct temporal useful with --b-pyramid, since it will fall back to spatial for frames where temporal is broken. commit 04996dfb749955610daeb9a35bf3e1230ead460a [revision 1439] Author: David Conrad Date: Sun Oct 4 07:24:42 2009 -0400 iPhone compilation support Also add --sysroot to configure options To build for iPhone 3gs / iPod touch 3g: CC=/Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/gcc ./configure --host=arm-apple-darwin --sysroot=/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS3.0.sdk For older devices, add --extra-cflags='-arch armv6 -mcpu=arm1176jzf-s' --extra-ldflags='-arch armv6' --disable-asm commit b46cec4f0128df6dc5cea0fca4d73671fe11bbdc [revision 1438] Author: David Conrad Date: Fri Jan 8 22:40:09 2010 -0500 ARM NEON versions of weightp functions commit 40d7215682b25d7e39d466b7277f06be88551672 [revision 1437] Author: David Conrad Date: Sat Feb 13 01:25:56 2010 -0800 Use #ifdef instead of #if in checkasm commit fc94a28317cf760ea6dc2007ac5f5de683d2d376 [revision 1436] Author: Fiona Glaser Date: Sat Feb 13 00:52:31 2010 -0800 Make the ABR buffer consider the distance to the end of the video Should improve bitrate accuracy in 2-pass mode. May also slightly improve quality by allowing more variation earlier-on in a file. Also fix abr_buffer with 1-pass: it does something very different than what it does for 2-pass. Thus, the earlier change that increased it based on threads caused 1-pass ABR to be somewhat less accurate. commit 2e9ec3f66aff3b4bae016ad1c3f26d96c7f9c9cd [revision 1435] Author: Alexander Strange Date: Sat Feb 13 02:22:04 2010 -0500 Mark cli_input/output_t variables as const when possible commit e49414918c8ed3ee38af734ae942292c8380ce87 [revision 1434] Author: Alexander Strange Date: Sat Feb 13 02:00:57 2010 -0500 mkv: Write the x264 version into the file header This only updates the "writing application"; matroska_ebml.c is the "muxing application", but the version string for that is still hardcoded. commit 18db4871662008c789980d20b869418e4b08574d [revision 1433] Author: Alexander Strange Date: Sat Feb 13 01:41:41 2010 -0500 mkv: Write SimpleBlock instead of Block for frame headers mkvtoolnix writes these by default since 2009/04/13. Slightly simplifies muxer and allows 'mkvinfo -s' to show B-frames as 'B' (but not B-ref frames). commit 50c78eaea6f8a836dbbbc92f16c118c9fa7e58df [revision 1432] Author: Alexander Strange Date: Mon Nov 10 00:55:20 2008 -0500 Allow | as a separator between psy-rd and psy-trellis values. [,:/] are all taken when setting psy-trellis in a zone in an mencoder option. Also fix a comment typo and remove a useless line of code. commit b1c4cf9841beb88229da07ed60d7f1f394dfe341 [revision 1431] Author: Fiona Glaser Date: Fri Feb 12 21:15:12 2010 -0800 Backport various speed tweak ideas from ffmpeg Add mv0 early termination to spatial direct calculation Up to twice as fast direct mv calculation on near-motionless video. Branchless CAVLC level code adjustment based on trailing ones. A few clocks faster. Check tc value before clipping in C version of deblock functions. Much faster, but nobody uses those anyways. Thanks to Michael Niedermayer for the ideas. commit 0ee8e84ed0ccc302c74f2c20a68969cfaa8f6951 [revision 1430] Author: Fiona Glaser Date: Fri Feb 12 03:33:54 2010 -0800 Implement direct temporal + interlaced This was much easier than I expected. It will also be basically useless until TFF/BFF support gets in, since it requires delta_poc_bottom to be set correctly to work well. commit 8a57269d7ca3547f860568427423357166ba56c1 [revision 1429] Author: Fiona Glaser Date: Wed Feb 10 13:44:28 2010 -0800 Allow longer keyints with intra refresh If a long keyint is specified (longer than macroblock width-1), the refresh will simply not occur all the time. In other words, a refresh will take place, and then x264 will wait until keyint is over to start another refresh. commit 282bbbc5ff53aec253c5076a3a83bd19ba4e9104 [revision 1428] Author: Fiona Glaser Date: Wed Feb 10 12:12:29 2010 -0800 Overhaul sliced-threads VBV Make predictors thread-local and allow each thread to poll the others to get their predicted sizes. Many, many other tweaks to improve quality with small VBV and sliced threads. Note this may somewhat increase the risk of a VBV underflow in such extreme situations (single-frame VBV). This is tolerable, as most relevant use-cases are better off with a few rare underflows (even if they have to drop a slice) than consistent low quality. commit 50582675b0f20b923e629fb7e245900459e1e0b2 [revision 1427] Author: Fiona Glaser Date: Tue Feb 9 15:08:31 2010 -0800 Print psy-(rd|trellis) with more precision in userdata SEI commit fd189536dae73a14ae7cf4217fc6473dfcf5ddcb [revision 1426] Author: Fiona Glaser Date: Mon Feb 15 00:55:16 2010 -0800 More formatting fixes in x264 help commit 346f1679273b1235795145008f6390c291e89577 [revision 1425] Author: Henrik Gramner Date: Mon Feb 8 15:53:52 2010 -0800 Faster 2x2 chroma DC dequant commit e2c56268167522f6eaa0d3cc5fd38d11a7b48b1d [revision 1424] Author: Yusuke Nakamura Date: Mon Feb 8 01:48:38 2010 -0800 Write PASP atom in mp4 muxing Adds container-level aspect ratio support for mp4. commit 46819d56855b9a67efed6b164ad732dea86632f0 [revision 1423] Author: Fiona Glaser Date: Wed Feb 3 20:27:57 2010 -0800 Fix 2-pass ratecontrol continuation in case of missing statsfile Didn't work properly if MB-tree was enabled. commit 50b4cfbfec180b75d2a8dcaea9da502b4c5bbef4 [revision 1422] Author: Fiona Glaser Date: Wed Feb 3 20:01:16 2010 -0800 Smarter QPRD Catch some cases in which RD checks can be avoided; reduces QPRD RD calls by 10-20%. commit d9b6077d2be991628670c6e2780a403820fa5de7 [revision 1421] Author: Fiona Glaser Date: Wed Feb 3 18:36:44 2010 -0800 Fix subpel iteration counts with B-frame analysis and subme 6/8 Since subme 6 means "like subme 5, except RD on P-frames", B-frame analysis shouldn't use the RD subpel counts at subme 6. Similarly with subme 8. Slightly faster (and very marginally worse) compression at subme 6 and 8. commit fc6bc8c17cb2f24b320a2db72daf300b2b46d1ca [revision 1420] Author: Fiona Glaser Date: Wed Feb 3 18:19:29 2010 -0800 Simplify decimate checks in macroblock_encode Also fix a misleading comment. commit 29dd5ef2e5069f900e5a8730e05d2ed35dcf8c02 [revision 1419] Author: Fiona Glaser Date: Tue Feb 2 03:15:18 2010 -0800 Improve bidir search, fix some artifacts in fades Modify analysis to allow bidir to use different motion vectors than L0/L1. Always try the <0,0,0,0> motion vector for bidir. Eliminates almost all errant motion vectors in fades. Slightly improves PSNR as well (~0.015db). commit 27043c6b0a245859eab7d442a3d7a26cb9ba839e [revision 1418] Author: Fiona Glaser Date: Mon Feb 1 13:04:47 2010 -0800 Slightly faster predictor_difference_mmxext commit 34c42187c53f22de8b8ca90acfc3c7df9367ce7a [revision 1417] Author: Fiona Glaser Date: Fri Jan 29 02:40:41 2010 -0800 Add ability to adjust ratecontrol parameters on the fly encoder_reconfig and x264_picture_t->param can now be used to change ratecontrol parameters. This is extraordinarily useful in certain streaming situations where the encoder needs to adapt the bitrate to network circumstances. What can be changed: 1) CRF can be adjusted if in CRF mode. 2) VBV maxrate and bufsize can be adjusted if in VBV mode. 3) Bitrate can be adjusted if in CBR mode. However, x264 cannot switch between modes and cannot change bitrate in ABR mode. Also fix a bug where x264_picture_t->param reconfig method would not always be frame-exact. Commit sponsored by SayMama video calling. commit a424adb406070fe3ca9be7d02111a9d3b26d25f3 [revision 1416] Author: Yusuke Nakamura Date: Sat Jan 30 13:53:01 2010 -0800 Fix regression in r1406 Bitrate was printed incorrectly for some input framerates. commit 202938b1f3500578ade58bb478df53210d787364 [revision 1415] Author: Loren Merritt Date: Sat Jan 30 12:01:51 2010 -0800 Fix log2f detection, include order, some gcc warnings r1413 caused crashes on any system with malloc.h. Also switch to std=c99 or std=gnu99 if supported by the compiler. Fix visualize support. commit 0d38729198fd135bb8edeff4960d421512a26f43 [revision 1414] Author: Fiona Glaser Date: Fri Jan 29 11:01:44 2010 -0800 Fix abstraction violations in x264.c No calling application--not even x264cli--should ever look inside x264_t. commit ef92d3bb5a9b80ece30fe8411a3c33c47f080ce7 [revision 1413] Author: Diogo Franco Date: Thu Jan 28 17:28:03 2010 -0800 Move -D CFLAGS to config.h commit e84726c8c2a02b094292f063625a9fdbd6c71253 [revision 1412] Author: Steven Walters Date: Thu Jan 28 17:26:40 2010 -0800 Fix stat with large file support commit 567df927f0cd559c7be37b0a19257d1ee0ec5167 [revision 1411] Author: Diogo Franco Date: Wed Jan 27 20:29:50 2010 -0800 Implement ffms2 version check Depends on ffms2 version 2.13.1 (r272). Tries pkg-config's built-in version checking first. Uses only the preprocessor to avoid cross-compilation issues. commit 453a8ee404116dc05ecff0572a6353e466f3ef45 [revision 1410] Author: Fiona Glaser Date: Wed Jan 27 19:41:27 2010 -0800 Fix implicit CBR message to only print when in ABR mode Also make it print outside of debug mode. commit 46ff5086af715f072b7cdb221ac75e7d7774f520 [revision 1409] Author: Diogo Franco Date: Wed Jan 27 13:11:08 2010 -0800 Add configure check for log2 support Some incredibly braindamaged operating systems, such as FreeBSD, blatantly ignore the C specification and omit certain functions that are required by ISO C. log2f is one of these functions that periodically goes missing in such operating systems. commit ac759e900e307c34878ca61efa240935a0ebf82b [revision 1408] Author: Diogo Franco Date: Wed Jan 27 10:12:42 2010 -0800 Add config.log support Now, if configure fails, you'll be able to see why. commit 4b4962921a6cf58837bfe438227f9e6112faeb73 [revision 1407] Author: Diogo Franco Date: Wed Jan 27 09:26:35 2010 -0800 Fix cross-compiling with lavf, add support for ffms2.pc Also update configure script to work with newest ffms. commit afc36d0b0ff867541827e3ff0f517df4cdf31fd6 [revision 1406] Author: Yusuke Nakamura Date: Tue Jan 26 16:01:54 2010 -0800 Improve DTS generation, move DTS compression into libx264 This change fixes some cases in which PTS could be less than DTS. Additionally, a new parameter, b_dts_compress, enables DTS compression. DTS compression eliminates negative DTS (i.e. initial delay) due to B-frames. The algorithm changes timebase in order to avoid duplicating DTS. Currently, in x264cli, only the FLV muxer uses it. The MP4 muxer doesn't need it, as it uses an EditBox instead. commit 453c029929af2e835b7ee66acd5eb6968df72cdc [revision 1405] Author: Anton Mitrofanov Date: Tue Jan 26 11:41:18 2010 -0800 Various threading-related cosmetics Simplify a lot of code and remove some unnecessary variables. commit a93903c6085ca95e41ed84e2d1d8d22569dd1ae4 [revision 1404] Author: Fiona Glaser Date: Mon Jan 25 11:23:55 2010 -0800 Hardcode the bs_t in cavlc.c; passing it around is a waste Saves ~1.5kb of code size, very slight speed boost. commit 91c0fd9499a7132ac599080faf55daa3b4c5c89a [revision 1403] Author: David Conrad Date: Sat Jan 23 18:05:25 2010 -0800 Fix lavf input with pipes and image sequences x264 should now be able to encode from an image sequence using an image2-style formatted string (e.g. file%02d.jpg). commit 535f0fa5eb42012e7e62d9eda61c39a11e7b0cb4 [revision 1402] Author: Fiona Glaser Date: Thu Jan 21 23:07:11 2010 -0800 Fix bitstream alignment with multiple slices Broke multi-slice encoding on CPUs without unaligned access. New system simply forces a bitstream realignment at the start of each writing function and flushes when it reaches the end. commit f4186f5e87b6b85b8bceccf5fdca50fb7f6fdfc6 [revision 1401] Author: Fiona Glaser Date: Thu Jan 21 10:00:07 2010 -0800 Merge nnz_backup with scratch buffer Slightly less memory usage. commit ee911101d844b3f0baf653a0f05bc72fa5d32488 [revision 1400] Author: Steven Walters Date: Wed Jan 20 09:00:54 2010 -0800 Use cross-prefix properly with pkg-config for cross-compiling commit f5af5f14e5d924a3b57d6bfbd1219a334771727b [revision 1399] Author: Fiona Glaser Date: Mon Jan 18 20:29:33 2010 -0800 Various performance optimizations Simplify and compact storage of direct motion vectors, faster --direct auto. Shrink various arrays to save a bit of cache. Simplify and reorganize B macroblock type writing in CABAC. Add some missing ALIGNED macros. commit c0474786d6358580dd847dd5b3bfe7f2a5465ab1 [revision 1398] Author: Fiona Glaser Date: Mon Jan 18 15:50:06 2010 -0800 Fix crash on new AMD M300 and similar CPUs Apparently these CPUs have SSE4a, but not misaligned SSE. commit 7833e7bb18f8a16949e5047f0e8d081855a59c13 [revision 1397] Author: Fiona Glaser Date: Sun Jan 17 19:11:05 2010 -0500 Fix intra refresh with subme < 6 Also improve the quality of intra masking. commit eba302801995ae5ebc22999fb5a5823ddab61f00 [revision 1396] Author: Fiona Glaser Date: Sat Jan 16 20:11:29 2010 -0500 Add support for multiple --tune options Tunes apply in the order they are listed in the case of conflicts. Psy tunings, i.e. film/animation/grain/psnr/ssim, cannot be combined. Also clarify --profile, which forces the limits of a profile, not the profile itself. commit 5bcf1378ede23b75e24d7c71690563a92708723f [revision 1395] Author: Fiona Glaser Date: Sat Jan 16 02:50:15 2010 -0500 Various bugfixes and tweaks in analysis Fix the oldest-ever bug in x264: b16x8 analysis used the wrong width for predict_mv. Fix cache_ref calls for slightly better MV prediction in bsub16x16 analysis. Make B-partition analysis consider reference frame costs. Various other minor changes. Overall very slightly improved mode decision and motion search in B-frames. commit 741ed788e905820d2a9fc892ea288350e939b78f [revision 1394] Author: Loren Merritt Date: Thu Jan 14 14:52:12 2010 -0500 More --me tesa optimizations commit 4f7b5f6c1f717b0485d990aca1c0731eefb90f7a [revision 1393] Author: Fiona Glaser Date: Thu Jan 14 10:39:10 2010 -0500 Fix typo in configure commit 0b73a76e891516f87412e89c2a298451741578ba [revision 1392] Author: Fiona Glaser Date: Thu Jan 14 00:07:30 2010 -0500 Make --fps force CFR mode commit c37a51005f305e7c339051e102fbcab266cbef83 [revision 1391] Author: Fiona Glaser Date: Wed Jan 13 20:21:31 2010 -0500 Eliminate intentional array overflow in quant matrix handling While it probably never caused problems, it was incredibly ugly and evil. commit 0210f805a696d257a714fd211c3df3457ea26ba8 [revision 1390] Author: Fiona Glaser Date: Wed Jan 13 20:16:13 2010 -0500 Faster --me tesa commit da619d5deaae712cdfc3641c8ce7a51591fdc4d5 [revision 1389] Author: Anton Mitrofanov Date: Wed Jan 13 15:44:00 2010 -0500 Fix static pthreads + dynamically linked x264 on win32 Add the necessary static pthread initialization code to a new DLLmain function. commit 229d8d76886b740d3403fc942de7e03062946dd0 [revision 1388] Author: Steven Walters Date: Tue Jan 12 22:55:10 2010 -0500 Add getopt_long to the included getopt.c Fixes option handling on OSs that have a nonworking/missing getopt (e.g. Solaris). commit 62ece1c2fef6388925edcce15f519450b2add5dd [revision 1387] Author: Fiona Glaser Date: Tue Jan 12 20:14:35 2010 -0500 Faster psy-trellis init Remove some unncessary zigzags. commit 85dc3f9fab268bda2ec626c7384a6fb2a0b146ba [revision 1386] Author: Fiona Glaser Date: Tue Jan 12 19:19:07 2010 -0500 Simplfy intra mode availability handling Slightly faster, 1.5kb smaller binary size, less code. commit 398d0eb3e86ccd1b092fa52cf1217cc58b22ddaa [revision 1385] Author: Fiona Glaser Date: Sun Jan 10 15:14:02 2010 -0500 Fix free callback, add x264_encoder_parameters function x264 would try to use the passed param struct after freeing if the param_free callback was set. Probably didn't cause any issues, as probably no programs used the callback in this location yet. A new x264_encoder_parameters function is now available in the API. This function lets the calling application grab the current state of the encoder's parameters. Use this in x264cli to ensure that the param struct used for set_param is updated with whatever changes x264_encoder_open has made to it. Patch partially by Anton Mitrofanov . commit aa48c1fbb74308fecfe5f7eceee63076479f32dd [revision 1384] Author: David Conrad Date: Sat Jan 9 01:52:33 2010 -0500 Fix x264 compilation on Apple GCC Apple's GCC stupidly ignores the ARM ABI and doesn't give any stack alignment beyond 4. commit fd1cf29494463f0dd9ac9b01158a78f7c7913a0f [revision 1383] Author: Fiona Glaser Date: Sat Jan 2 03:27:46 2010 -0500 Faster weightp motion search For blind-weight dupes, copy the motion vector from the main search and qpel-refine instead of doing a full search. Fix the p8x8 early termination, which had unexpected results when combined with blind weighting. Overall, marginally reduces compression but should potentially improve speed by over 5%. commit bc0ae2ef40289c310027902e72572e5d8990fbd8 [revision 1382] Author: Fiona Glaser Date: Thu Dec 31 13:45:27 2009 -0500 More correct padding constants for lowres planes Since lowres analysis isn't interlace-aware, we don't need to double the vertical padding for interlaced video. commit 5d40e878b75422c7b13cd5ab01ddfc2cf6b33938 [revision 1381] Author: Fiona Glaser Date: Thu Dec 31 02:57:45 2009 -0500 Fix some invalid reads caught by valgrind Temporal predictor calculation was misled by invalid reference counts for I-frames. commit cde39046222b112261179144033e7a51430783d0 [revision 1380] Author: Fiona Glaser Date: Tue Dec 22 18:59:29 2009 -0500 Periodic intra refresh Uses SEI recovery points, a moving vertical "bar" of intra blocks, and motion vector restrictions to eliminate keyframes. Attempt to hide the visual appearance of the intra bar when --no-psy isn't set. Enabled with --intra-refresh. The refresh interval is controlled using keyint, but won't exceed the number of macroblock columns in the frame. Greatly benefits low-latency streaming by making it possible to achieve constant framesize without intra-only encoding. Combined with slice-max size for one slice per packet, tests suggest effective resiliance against packet loss as high as 25%. x264 is now the best free software low-latency video encoder in the world. Accordingly, change the API to add b_keyframe to the parameters present in output pictures. Calling applications should check this to see if a frame is seekable, not the frame type. Also make x264's motion estimation strictly abide by horizontal MV range limits in order for PIR to work. Also fix a major bug in sliced-threads VBV handling. Also change "auto" threads for sliced threads to "cores" instead of "1.5*cores" after performance testing. Also simplify ratecontrol's checking of first pass options. Also some minor tweaks to row-based VBV that should improve VBV accuracy on small frames. commit 30d76a5eee9355c5d3e81fc7eae65f926dec16a9 [revision 1379] Author: Kieran Kunhya Date: Mon Dec 28 10:42:17 2009 -0500 LAVF/FFMS input support, native VFR timestamp handling libx264 now takes three new API parameters. b_vfr_input tells x264 whether or not the input is VFR, and is 1 by default. i_timebase_num and i_timebase_den pass the timebase to x264. x264_picture_t now returns the DTS of each frame: the calling app need not calculate it anymore. Add libavformat and FFMS2 input support: requires libav* and ffms2 libraries respectively. FFMS2 is _STRONGLY_ preferred over libavformat: we encourage all distributions to compile with FFMS2 support if at all possible. FFMS2 can be found at http://code.google.com/p/ffmpegsource/. --index, a new x264cli option, allows the user to store (or load) an FFMS2 index file for future use, to avoid re-indexing in the future. Overhaul the muxers to pass through timestamps instead of assuming CFR. Also overhaul muxers to correctly use b_annexb and b_repeat_headers to simplify the code. Remove VFW input support, since it's now pretty much redundant with native AVS support and LAVF support. Finally, overhaul a large part of the x264cli internals. --force-cfr, a new x264cli option, allows the user to force the old method of timestamp handling. May be useful in case of a source with broken timestamps. Avisynth, YUV, and Y4M input are all still CFR. LAVF or FFMS2 must be used for VFR support. Do note that this patch does *not* add VFR ratecontrol yet. Support for telecined input is also somewhat dubious at the moment. Large parts of this patch by Mike Gurlitz , Steven Walters , and Yusuke Nakamura . commit 8c8bfe19dfe0dd0728771594dac2141051860aef [revision 1378] Author: Fiona Glaser Date: Tue Dec 15 16:59:00 2009 -0800 More help typo fixes commit 65f988b7bb003e6503133231423a8f5192d32603 [revision 1377] Author: Loren Merritt Date: Thu Jan 14 03:07:30 2010 +0000 Fix x264_clz on inputs > 1<<31 (though x264 never generates such inputs) commit b7fa2857a9eeb3275035673f47a6d64331234816 [revision 1376] Author: Fiona Glaser Date: Sun Dec 13 03:16:04 2009 -0800 Don't do sum/ssd analysis if weightp == 1 Typo fixes in comments and help. commit f30aed6d810ef408cbf19cc6760605b0b87cbfde [revision 1375] Author: Fiona Glaser Date: Fri Dec 11 17:22:18 2009 -0800 Fix two bugs in 2-pass ratecontrol last_qscale_for wasn't set during the 2pass init code. abr_buffer was way too small in the case of multiple threads, so accordingly increase its buffer size based on the number of threads. May significantly increase quality with many threads in 2-pass mode, especially in cases with extremely large I-frames, such as anime. commit 7f0ef681aa92c585fcb3534b370c7ac60e4866ec [revision 1374] Author: Steven Walters Date: Thu Dec 10 19:48:51 2009 -0800 Avisynth-MT and 2.6 compatibility fixes Explain to the user why YV12 conversion is forced with Avisynth 2.6. Fix encoding with Avisynth-MT scripts by inserting the necessary Distributor() call; speeds such scripts back up to expected levels. commit e09a20eb39776192715025f52637edf6208488e9 [revision 1373] Author: Steven Walters Date: Wed Dec 9 16:03:19 2009 -0800 Fix zone parsing on mingw Due to MinGW evidently being in the hands of a pack of phenomenal idiots, MinGW does not have strtok_r, a basic string function. As such, remove the dependency on strtok_r in zone parsing. Previously, using zones for anything other than ratecontrol failed. commit 84ccdd3a6d1fd2193daabc75cc6299e24fb0e996 [revision 1372] Author: Fiona Glaser Date: Wed Dec 9 15:03:44 2009 -0800 More lookahead optimizations Under subme 1, don't do any qpel search at all and round temporal MVs accordingly. Drop internal subme with subme 1 to do fullpel predictor checks only. Other minor optimizations. commit bf70233e48ef64e766adc694c13526be19739b7f [revision 1371] Author: Fiona Glaser Date: Wed Dec 9 05:56:35 2009 -0800 Various minor missing changes from previous commits Boolify sliced threads too Remove unused constants from dct-a.asm Fix a few typos/minor errors in preset documentation commit 0b34d4672b1517a5b166d96d20f952122c7a09f7 [revision 1370] Author: Fiona Glaser Date: Thu Dec 10 16:52:39 2009 -0800 Fix regression in direct=auto/temporal in r1364 Bug caused rare race condition in frame reference handling. This resulted in invalid bitstreams in some B-frames and, very rarely, crashes. commit c0e6a94555942bfd0e3b51dfb2aebe695a23754f [revision 1369] Author: Fiona Glaser Date: Tue Dec 8 17:46:55 2009 -0800 Add fast pskip to x264 SEI info header commit f0ac608d00433fa6fbff95c0074f2de9e16b4a93 [revision 1368] Author: Steven Walters Date: Tue Dec 8 11:36:25 2009 -0800 Minor seeking fix with Avisynth input Seeking past the end of the input with --seek would result in the same frame being repeated over and over. commit c186d2ac9c2ac4f2157e63b5b86f5bb378ceeffd [revision 1367] Author: Fiona Glaser Date: Tue Dec 8 03:08:17 2009 -0800 Add support for MB-tree + B-pyramid Modify B-adapt 2 to consider pyramid in its calculations. Generally results in many more B-frames being used when pyramid is on. Modify MB-tree statsfile reading to handle the reordering necessary. Make differing keyint or pyramid between passes into a fatal error. commit 073d32e5801899fa516da54bf06527c0ab74dd7b [revision 1366] Author: Fiona Glaser Date: Mon Dec 7 18:34:05 2009 -0800 Use aliasing-avoidance macros in array_non_zero commit 346844afc4edc1b5c36990b207706e2fd9b815b0 [revision 1365] Author: Cleo Saulnier Date: Mon Dec 7 12:40:14 2009 -0800 MMX version of 8x8 interlaced zigzag Just as fast as SSSE3 on Nehalem (and faster on Conroe/Penryn), so remove the SSSE3 version. commit 6f221210903f1b4e06146b3cf5e618d62dfc0a8c [revision 1364] Author: Fiona Glaser Date: Mon Dec 7 00:49:41 2009 -0800 Bring back slice-based threading support Enabled with --sliced-threads Unlike normal threading, adds no encoding latency. Less efficient than normal threading, both performance and compression-wise. Useful for low-latency encoding environments where performance is still important, such as HD videoconferencing. Add --tune zerolatency, which eliminates all x264 encoder-side latency (no delayed frames at all). Some tweaks to VBV ratecontrol and lookahead (in addition to those required by sliced threading). Commit sponsored by a media streaming company that wishes to remain anonymous. commit a2380187a1a08b71d06fa5302c2356d28f4b7ffc [revision 1363] Author: Alex Jurkiewicz Date: Mon Dec 7 18:17:29 2009 -0800 Add more detailed help for presets/tunes/profiles Shows what options they represent. commit 744fd11932329fa11b444dfa195c6e669cb3061a [revision 1362] Author: Fiona Glaser Date: Sat Dec 5 03:19:44 2009 -0800 qpel RD no longer needs mbcmp_unaligned commit 9a100e51dd91ed4cf50bdad4a91b100dd838409d [revision 1361] Author: Loren Merritt Date: Wed Dec 9 00:37:09 2009 +0000 ensure that all boolean options are {0,1} so they print consistently in the options SEI commit 75b3871f90713a290be183e1436e792cef51f335 [revision 1360] Author: Fiona Glaser Date: Sat Dec 5 02:27:30 2009 -0800 Actually do r1356 Somehow commit r1356 got lost in the ether. I'm not sure how, but now it's fixed. commit 16bedf051d4a0147340b0f7de24dd68081ac2df9 [revision 1359] Author: Steven Walters Date: Fri Dec 4 12:17:56 2009 -0800 Remove some unused code from x264.c commit 5ec818320745db0e78c37cbf7db3a6ec2c6c8dfb [revision 1358] Author: Fiona Glaser Date: Thu Dec 3 15:36:52 2009 -0800 SSSE3 version of zigzag_8x8_field Slightly faster interlaced encoding with 8x8dct. Helps most on Nehalem, somewhat disappointing on Conroe/Penryn. commit f851c923d59fbaa73bf1148be3b796b043a7e187 [revision 1357] Author: Fiona Glaser Date: Wed Dec 2 19:55:45 2009 -0800 Fix crash in interlaced with >8 refs Crash introduced in weightp. commit 4aa33d658263abb40bf91438b5ec1eb93d86621f [revision 1356] Author: Fiona Glaser Date: Tue Dec 1 16:15:15 2009 -0800 Significantly faster qpel-RD Cache the results of MC, like in bidir-RD. Slightly changes output due to the necessary reordering of satd/RD calls. 5-10% faster qpel-RD. commit aaf7548eb173d622fc8e715e1319dc6d2c8e2853 [revision 1355] Author: David Conrad Date: Tue Dec 1 12:23:09 2009 -0800 Add x264 prefix to functions with ffmpeg equivalents Not important now, but will be when we add libav* input support. commit ade48a91e4d6933957a368bdab3dd7e0640925fc [revision 1354] Author: Fiona Glaser Date: Mon Nov 30 01:41:24 2009 -0800 10L in r1353 Broke mp4 output. commit 025f01dba74e6f92aa282e01cafeb8ec841af3e7 [revision 1353] Author: Steven Walters Date: Thu Nov 26 22:37:18 2009 -0800 Enhanced Avisynth input support Requires avisynth_c.h from the Avisynth API headers. Reports errors properly from Avisynth script input. Automatically construct input scripts for almost any input file. Tries ffmpegsource2, DSS2, directshowsource, and many other sourcing methods, based on the input file extension. Automatically converts to YV12. commit 979c14da90d69d05661430ace29d111efe615281 [revision 1352] Author: Fiona Glaser Date: Wed Nov 25 10:40:08 2009 -0800 Much faster weightp Move sum/ssd calculation out of lookahead and do it only once per frame. Also various minor optimizations, cosmetics, and cleanups. commit cee009ff0577582f093d01f9a88909157734858e [revision 1351] Author: Kieran Kunhya Date: Wed Nov 25 01:26:02 2009 -0800 Fix bugs in fps/timestamp handling in FLV muxer commit eaf9ab20af24900468a0ac71549a9e01d2dca92f [revision 1350] Author: Fiona Glaser Date: Tue Nov 24 22:37:02 2009 -0800 Fix bug in weightp analysis Weights weren't reset upon early terminations, so old (wrong) weights could stick around. Small compression improvement. commit a9885c789e5a4703cf43b6601cc347836484f853 [revision 1349] Author: Fiona Glaser Date: Tue Nov 24 20:24:14 2009 -0800 Minor deblocking optimization, update comments commit b02dd71289e2532bdc3f764145490d62783dd296 [revision 1348] Author: Fiona Glaser Date: Tue Nov 24 16:21:07 2009 -0800 Fix weightb with delta_poc_bottom Has no effect yet, but will be required once we add TFF/BFF signalling support in interlaced mode. Gives 0.5-0.7% better compression with proper TFF/BFF signalling. commit 12353fb902f647a264fb1fe2d584ffb4f2ed2c4f [revision 1347] Author: Fiona Glaser Date: Fri Nov 20 23:27:51 2009 -0800 Give more meaningful error if 1st/2nd pass resolution differ commit d38bce2c91c2d766f5611c6b100674ad66016e1f [revision 1346] Author: Steven Walters Date: Fri Nov 20 12:04:13 2009 -0800 Fix extremely rare deadlock with sync-lookahead Patch partially by Anton Mitrofanov. commit c86a6de2b4116fd4c83d82cf9e5dec17f2519050 [revision 1345] Author: Fiona Glaser Date: Fri Nov 20 08:04:28 2009 -0800 Only print weightp stats if there were P-frames commit 321674b7573a59ba1bdc2b0cab65a4e1e6b7c411 [revision 1344] Author: Fiona Glaser Date: Wed Nov 18 13:47:04 2009 -0800 Faster lookahead with subme=1 If it hasn't been clear already, don't use subme=1 as a "fast first pass" option. Use subme=2 instead; 1 and below now enable a fast (lower quality) lookahead mode. commit 63f7147714b37f1779dcf62138f21771368cb8e8 [revision 1343] Author: Fiona Glaser Date: Mon Nov 16 15:23:58 2009 -0800 Faster weightp analysis Modify pixel_var slightly to return the necessary information and use it for weight analysis instead of sad/ssd. Various minor cosmetics. commit 118dc81e7116b16ba1d2204a387aa88669b8e0bd [revision 1342] Author: Dylan Yudaken Date: Sun Nov 15 16:14:50 2009 -0800 Fix two issues in weightp If analysis decided on an offset of -128, x264 would create non-compliant streams. Fix some cases with nearly all intra blocks where analysis could pick very weird weights. Also add some asserts to check compliancy. commit 876c9e528bfcbf8932ac9ffb5091fb3f541ddb91 [revision 1341] Author: Alexander Strange Date: Sat Nov 14 22:16:18 2009 -0800 Allow compilation with non-Apple GCC on OS X commit ddbac0c6f6b320eb8b1731c744d7803140b0a5a3 [revision 1340] Author: Alexander Strange Date: Sat Nov 14 22:13:28 2009 -0800 Use __attribute__((may_alias)) for type-punning GCC thinks pointer casts to unions aren't valid with strict aliasing. See http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Optimize-Options.html#Type_002dpunning. Also use M32() in y4m.c. Enable -Wstrict-aliasing again since all such warnings are fixed. commit 69163c3b6d8fe0b85cddd4e47c6a4bdbf6c170f9 [revision 1339] Author: Fiona Glaser Date: Sat Nov 14 19:58:46 2009 -0800 100l in deadlock fix commit 25a029458f70ac1bee8369bee321c8fdcf166f18 [revision 1338] Author: Kieran Kunhya Date: Sat Nov 14 19:01:09 2009 -0800 FLV muxing support commit b9ce3a10bacb32af12756d4104d7d8ef255c140a [revision 1337] Author: Fiona Glaser Date: Sat Nov 14 18:40:22 2009 -0800 Fix rare deadlock introduced in weightp commit de0e873567cb5bd900f72c88f2fafefb1f890a51 [revision 1336] Author: Fiona Glaser Date: Thu Nov 12 12:40:40 2009 -0800 Actually add -Wno-strict-aliasing to configure commit 45b28315b47759f29fd1605814ea361990c00dea [revision 1335] Author: Dylan Yudaken Date: Thu Nov 12 07:03:46 2009 -0800 Various weightp fixes Make weightp results match in threaded vs non-threaded mode. Fix two-pass with slow-firstpass. commit 03cb8c09553f24bf800cd47893e48b0aa91f9313 [revision 1334] Author: Fiona Glaser Date: Thu Nov 12 05:25:32 2009 -0800 Fix all aliasing violations New type-punning macros perform write/read-combining without aliasing violations per the second-to-last part of 6.5.7 in the C99 specification. GCC 4.4, however, doesn't seem to have read this part of the spec and still warns about the violations. Regardless, it seems to fix all known aliasing miscompilations, so perhaps the GCC warning generator is just broken. As such, add -Wno-strict-aliasing to CFLAGS. commit 241aacca01ee167ec632194ea0edbeffa44145df [revision 1333] Author: David Conrad Date: Wed Nov 11 20:53:49 2009 -0800 Fix 10l in weightp on ARM commit 3a4c7dae3deeeb729251e1098d70befab1ad4a0e [revision 1332] Author: Fiona Glaser Date: Mon Nov 9 21:22:41 2009 -0800 Fix one (of possibly many) miscompilations in weightp Use NOINLINE and some emms calls to fix emms reordering issues. This issue occurred with some GCC versions if threads > 1 and the phase of the moon was right. Also a cosmetic in x264.c. commit 4ed2a8e3d46a1a90df41ed0a195a3926f36d6c15 [revision 1331] Author: Fiona Glaser Date: Mon Nov 9 09:18:03 2009 -0800 Fix pixel_ssd on win64 Didn't preserve XMM registers, may or may not have caused problems. commit b305297084738ef84a3c60d53f001734b1dd96f5 [revision 1330] Author: Steven Walters Date: Sun Nov 8 22:18:35 2009 -0800 Fix weightp logfile parsing on MinGW commit df732ec7f119a02fb124261d588151026114d43d [revision 1329] Author: Loren Merritt Date: Mon Nov 9 05:27:29 2009 +0000 cosmetics commit 094110915e2de3410feca47463bae4a8b28f587e [revision 1328] Author: David Conrad Date: Sun Nov 8 20:12:54 2009 -0800 Fix weightp on ARM + PPC No ARM or PPC assembly yet though. commit ccac8546bc1596f2313c3f472caab720c9753275 [revision 1327] Author: Dylan Yudaken Date: Sun Nov 8 17:59:08 2009 -0800 Weighted P-frame prediction Merge Dylan's Google Summer of Code 2009 tree. Detect fades and use weighted prediction to improve compression and quality. "Blind" mode provides a small overall quality increase by using a -1 offset without doing any analysis, as described in JVT-AB033. "Smart", the default mode, also performs fade detection and decides weights accordingly. MB-tree takes into account the effects of "smart" analysis in lookahead, even further improving quality in fades. If psy is on, mbtree is on, interlaced is off, and weightp is off, fade detection will still be performed. However, it will be used to adjust quality instead of create actual weights. This will improve quality in fades when encoding in Baseline profile. Doesn't add support for interlaced encoding with weightp yet. Only adds support for luma weights, not chroma weights. Internal code for chroma weights is in, but there's no analysis yet. Baseline profile requires that weightp be off. All weightp modes may cause minor breakage in non-compliant decoders that take shortcuts in deblocking reference frame checks. "Smart" may cause serious breakage in non-compliant decoders that take shortcuts in handling of duplicate reference frames. Thanks to Google for sponsoring our most successful Summer of Code yet! commit b06734129a221be0c7a9a66c91ad042338abcd7c [revision 1326] Author: Steven Walters Date: Sun Nov 8 11:53:48 2009 -0800 Fix assert failure in the case of forced i-frames Note that this applies to non-IDR i-frames, not IDR-frames. This fix is also required for future open-gop. commit 133ee69dff02b7db62dc99aca3f60c534c90eb34 [revision 1325] Author: Steven Walters Date: Sat Nov 7 17:07:28 2009 -0800 Fix issues relating to input/output files being pipes/FIFOs commit 53a5772a35451c897366adda72d3a44c13103c38 [revision 1324] Author: David Conrad Date: Sat Nov 7 09:25:18 2009 -0800 Various ARM-related fixes Fix comment for mc_copy_neon. Fix memzero_aligned_neon prototype. Update NEON (i)dct_dc prototypes. Duplicate x86 behavior for global+hidden functions. commit 30b3825ed00f7f88397a26760cac5248f2f8e226 [revision 1323] Author: Fiona Glaser Date: Wed Nov 4 00:03:14 2009 -0800 Fix miscompilation with gcc 4.3 on ARM Aliasing violation in spatial prediction caused nasty artifacts. Shut up two other GCC warnings while we're at it. commit d2e7a5a6bf716b2cc1ad32bb842d28935060ccc3 [revision 1322] Author: Fiona Glaser Date: Tue Nov 3 23:15:35 2009 -0800 Fix extremely rare infinite loop in 2-pass VBV Implicit conversion from double->float lost enough precision to cause the loop termination condition to never trigger. Bug report by Tal Aloni. commit f3c9e6f3e77070f2f5447ef006959e8885a38e55 [revision 1321] Author: Anton Mitrofanov Date: Sat Oct 31 19:51:14 2009 -0700 Fix large file support, broken in r1302 commit 99cf5bf62a738b05c7168f04e344d6d596c874d3 [revision 1320] Author: Fiona Glaser Date: Fri Oct 30 18:58:03 2009 -0700 Dramatically reduce size of pixel_ssd_* asm functions ~10k of code size eliminated. commit 3ddc66cc5f785f4791939975dee1244a513f2a50 [revision 1319] Author: Loren Merritt Date: Sat Nov 7 06:09:47 2009 +0000 fix bottom-right pixel of lowres planes, which was uninitialized. weirdly, valgrind reported this only with --no-asm. commit b4838a5e3ca349719227f190d79ba7e534742a72 [revision 1318] Author: Fiona Glaser Date: Thu Oct 29 12:28:37 2009 -0700 Further reduce code size in bime ~7-8 kilobytes saved, ~0.6% faster subme 9. commit ecbe2b47036d62f05edcaedea381194ae50516f3 [revision 1317] Author: Anton Mitrofanov Date: Wed Oct 28 12:57:11 2009 -0700 Fix case in which MB-tree didn't propagate all data correctly Should improve quality in all cases. Also some minor cosmetic improvements. commit a0bbef702a4aa9a36c780ab9ed3eade4e31412d4 [revision 1316] Author: Fiona Glaser Date: Tue Oct 27 16:01:46 2009 -0700 Take into account chroma MV offset during interlaced motion search Small improvement in interlaced compression. commit 98a6d134d3638785bda99e1303c00f3ce471ec63 [revision 1315] Author: Fiona Glaser Date: Tue Oct 27 15:08:37 2009 -0700 Slightly faster ssse3 width4 chroma MC Cacheline-aware in the same fashion as width8, but not conditional. commit 8dc839a6300c116faf040b2dae47b06c2920b4f8 [revision 1314] Author: Fiona Glaser Date: Tue Oct 27 14:01:46 2009 -0700 Eliminate some rare cases where MB-tree gave incorrect results in B-frames Also get rid of some unnecessary memcpies. commit 59f31c25f4c0f20358fc2ef15c2257d2b05716c2 [revision 1313] Author: Anton Mitrofanov Date: Tue Oct 27 12:28:07 2009 -0700 Fix cases in which b-adapt 1 could result in AUTO-type frames. This didn't actually cause any issues, but it removes the need for the fixing-up code that prevented said issues. commit 80a3909c1373ceceabed0f41eee366fc7de7cb1b [revision 1312] Author: Fiona Glaser Date: Mon Oct 26 12:53:07 2009 -0700 Motion compensation optimizations Turning off inlining saves a whole boatload of code size for near-zero speed cost. Simplify offset calculation. Various other optimizations. commit 9ef68adbe37c707a1195e4027ef8bbfc655b090b [revision 1311] Author: Fiona Glaser Date: Sun Oct 25 19:41:10 2009 -0700 Minor CAVLC optimizations commit d947f151e09a8c412c23b2d0800a0570d6fe6287 [revision 1310] Author: Loren Merritt Date: Sun Oct 25 19:34:12 2009 +0000 cosmetics commit 35838f77a2961f52f8eb7e9d236c0d6abb47c0fc [revision 1309] Author: Fiona Glaser Date: Sun Oct 25 09:14:27 2009 -0700 ISC-license x86inc.asm As the assembly abstraction layer is very useful in non-x264 projects, it is now ISC (simplified BSD) so that others, even in commercial projects, can use it as well. commit 2b695c6dd398c4ef16c9763dc64f57fad1e081d5 [revision 1308] Author: Fiona Glaser Date: Fri Oct 23 16:20:39 2009 -0700 Various minor CABAC optimizations commit 60fb787e49986c6b83825d59bd64866cb8be82be [revision 1307] Author: Lamont Alston Date: Fri Oct 23 11:01:13 2009 -0700 Fix bug in b-pyramid strict Bug caused invalid streams in some situations. commit 1e3729ecfaeae534162a7770479b6761d41b38b2 [revision 1306] Author: Fiona Glaser Date: Fri Oct 23 02:34:49 2009 -0700 Remove non-mod16 warning Compression only "suffers" by an extremely marginal amount and too many people misinterpret the warning. commit a7d3ceb4871dbe46c8437be014ac45d550602f9e [revision 1305] Author: Fiona Glaser Date: Thu Oct 22 22:38:32 2009 -0700 Fix two warnings + some minor optimizations commit d1df0c41c83db70090389f5d3f9b8824983c630f [revision 1304] Author: Fiona Glaser Date: Mon Oct 19 22:38:01 2009 -0700 Fix a typo in b-pyramid help And an errant space in common/macroblock.c commit 62ff2d4358372147429d295c34dfc425ddb30e58 [revision 1303] Author: Henrik Gramner Date: Mon Oct 19 12:57:47 2009 -0700 A bit more write-combining in macroblock_cache_load commit a0df454b358000eb4f5485f8d09a2620fa6c32e5 [revision 1302] Author: Steven Walters Date: Sat Oct 24 00:23:50 2009 +0000 split muxers.c into one file per format simplify internal muxer API commit d73d798ef054c36250d16f75f267091bd5b6a877 [revision 1301] Author: Fiona Glaser Date: Mon Oct 19 02:43:48 2009 -0700 Update fprofile with the latest change to b-pyramid commit ed903d902bca9ab8ddde93aef5f38ef9a7883a99 [revision 1300] Author: Steven Walters Date: Sat Oct 17 12:54:41 2009 -0700 Fix assertion fail and incorrect costs with pyramid+VBV Deal properly with QPfile'd B-refs. x264 should handle multiple B-refs per minigop now, though only via forced frametypes. commit 318298e9e1c742fb1453ce8ae6574eaed7487e65 [revision 1299] Author: Fiona Glaser Date: Sat Oct 17 03:04:56 2009 -0700 Improve CRF initial QP selection, fix get_qscale bug If qcomp=1 (as in mb-tree), we don't need ABR_INIT_QP. get_qscale could give slightly weird results with still images commit d9e6b1732e8a49054870e50990dc54e659f9e1af [revision 1298] Author: Fiona Glaser Date: Wed Oct 14 11:32:27 2009 -0700 Print more accurate error message if dump_yuv fails commit 29dba1c3446f51ddcd003e4e0998d931ddb24920 [revision 1297] Author: Steven Walters Date: Tue Oct 13 09:56:04 2009 -0700 Reduce memory usage of b-adapt 2 trellis Also fix a minor bug where the algorithm ignored the last frame in the trellis. commit cf5ba8134a4bdd381e75a5c2ea434198a7174a36 [revision 1296] Author: Lamont Alston Date: Mon Oct 12 23:32:16 2009 -0700 Make B-pyramid spec-compliant The rules of the specification with regard to picture buffering for pyramid coding are widely ignored. x264's b-pyramid implementation, despite being practically identical to that proposed by the original paper, was technically not compliant. Now it is. Two modes are now available: 1) strict b-pyramid, while worse for compression, follows the rule mandated by Blu-ray (no P-frames can reference B-frames) 2) normal b-pyramid, which is like the old mode except fully compliant. This patch also adds MMCO support (necessary for compliant pyramid in some cases). MB-tree still doesn't support b-pyramid (but will soon). commit e691cc0e3563b554e199cafbec82109d6a496c36 [revision 1295] Author: Fiona Glaser Date: Mon Oct 12 23:28:26 2009 -0700 Add missing free for nal_buffer Fixes a memory leak. commit f6431055381c311aa08bfb9335bd1adb9bd8be3e [revision 1294] Author: Loren Merritt Date: Sun Oct 18 21:47:18 2009 +0000 sync yasm macros to ffmpeg commit 040663db98c09eb819364d77059450166c294950 [revision 1293] Author: Loren Merritt Date: Sat Oct 17 14:54:49 2009 +0000 eliminate some divisions commit 744ea94e0d76db75eb111f1e8c9f4804165a6315 [revision 1292] Author: Fiona Glaser Date: Mon Oct 12 18:40:28 2009 -0700 Fix glitches with slow-firstpass + weightb + multiref + 2pass Bug in r1277 commit 84999edb42023a75aadffe356cce538834dee84b [revision 1291] Author: Henrik Gramner Date: Mon Oct 12 15:44:13 2009 -0700 Simplify some code in b-adapt 2's trellis commit d421ce5d657cdd8e200ac003cbfcffb45d6a388d [revision 1290] Author: Fiona Glaser Date: Mon Oct 12 15:38:51 2009 -0700 Fix a very rare integer overflow in slicetype analysis Caused an assert failure when it occurred. Bug is as old as adaptive B-frames. commit 07cfdf8468fbba99784f5faa5230cca34e149a29 [revision 1289] Author: Fiona Glaser Date: Mon Oct 12 13:14:19 2009 -0700 Reduce the aggressiveness of 2-pass VBV Now that B-frames are properly covered, we don't have to be as aggressive. This eliminates some issues with skyrocketing QPs in B-frames in 2-pass VBV. commit a0b07e91d128b91eb6bef7189e79a3f14f39af3d [revision 1288] Author: Fiona Glaser Date: Mon Oct 12 11:29:23 2009 -0700 Fix regression: disable flash detection without B-frames commit 1fbba0ca5d97d4f3250864c5cc6431c69855cb59 [revision 1287] Author: Loren Merritt Date: Sat Oct 10 04:43:00 2009 +0000 change all dct arrays to 1d. the C standard doesn't allow you to iterate 1-dimensionally over 2d arrays, and nothing other than the dsp functions themselves cares about the 2dness of dct. this fixes a miscompilation in x264_mb_optimize_chroma_dc. commit 507c83428027ba0886168b55324af5f1d5befbdd [revision 1286] Author: Fiona Glaser Date: Sun Oct 11 20:17:50 2009 -0700 Add row-based VBV for B-frames While B-frames still aren't explicitly covered by ratecontrol, this should resolve issues of VBV underflows due to larger-than-expected B-frames. commit c51d00b7c742aa84ac7e113ba03d808cb3132af2 [revision 1285] Author: Fiona Glaser Date: Sat Oct 10 17:35:03 2009 -0700 Improve VBV, fix bug in 2-pass VBV introduced in MB-tree Bug caused AQ'd row/frame costs to not be calculated (and thus caused underflows). Also make VBV more aggressive with more threads in 2-pass mode. Finally, --ratetol now affects VBV aggressiveness (higher is less aggressive). commit 1a1b9c6f9b35025223b4a7ca68af4ec95ede8f79 [revision 1284] Author: Anton Mitrofanov Date: Thu Oct 8 14:55:26 2009 -0700 Optimize exp2fix8 Slightly faster and more accurate rounding. commit c695f52485f11445c981f7a7b2e1a485ebec2d6a [revision 1283] Author: Fiona Glaser Date: Thu Oct 8 04:27:11 2009 -0700 Avoid scenecuts in flashes and similar situations "Flashes" are defined as any scene which lasts a very short period before a previous scene returns. A common example of this is of course a camera flash. Accordingly, look ahead during scenecut analysis and rule out the possibility of certain frames being scenecuts. Also handles cases of tons of short scenes in sequence and avoids making those scenecuts as well. Can only catch flashes of 1 frame in length with b-adapt 1. With b-adapt 2, can catch flashes of length --bframes. Speed cost should be negligible. commit 3b81316490e58524d3f86f2439cc8cfa2355eac3 [revision 1282] Author: Fiona Glaser Date: Tue Oct 6 22:15:10 2009 -0700 Fix bug where x264 generated non-compliant bitstreams with insane SAR values commit 6e8487f4ea6c761f3ddc14766dd254790f6c8e9e [revision 1281] Author: Loren Merritt Date: Wed Sep 30 22:39:13 2009 +0000 rm msvc project files and related ifdefs commit e9fbd8db8908074f46005383bf0c117d5fc4c8a8 [revision 1280] Author: Holger Lubitz Date: Tue Oct 6 15:17:34 2009 -0700 SSE4 version of 4x4 idct 27->24 clocks on Nehalem. This is really just an excuse to use "movsd" in a real function. Add some comments to subsum-related macros in x86util. commit 7639d496ccc83f28166471d3a2a54292110f572c [revision 1279] Author: Fiona Glaser Date: Sun Oct 4 19:15:28 2009 -0700 Constrained intra prediction support Enable with --constrained-intra. Significantly reduces compression, but required for the base layer of SVC encodes and maybe some other use-cases. Commit sponsored by a media streaming company that wishes to remain anonymous. commit 8270136f6ec2fc72087d1e8f15eed849300768e6 [revision 1278] Author: Fiona Glaser Date: Sun Oct 4 00:48:27 2009 -0700 Slightly improve non-RD p8x8 mode decision Subpartition costs are effectively zero in CABAC if sub-8x8 search is off. commit c1322c3198f981adf2e1a4221afdba9cfdc9345c [revision 1277] Author: Fiona Glaser Date: Sat Oct 3 00:59:02 2009 -0700 Reorder reference frames optimally on second pass About +0.1-0.2% compression at normal bitrates, up to +1% at very low bitrates. Only works if the first pass uses the same number of refs as the second (i.e. not with fast first pass). Thus, only worthwhile at insanely slow speeds: as such, enable slow-firstpass by default with preset placebo. Note that this changes the stats file format! commit deae6910e183789705532e6c94eba6dada3b9b00 [revision 1276] Author: Fiona Glaser Date: Wed Sep 30 12:13:16 2009 -0700 Fix typo in ratecontrol_summary commit 9dd6842dc649734219b1207481c6746bbc6e2198 [revision 1275] Author: Fiona Glaser Date: Tue Sep 29 23:32:07 2009 -0700 Clip log2_max_frame_num It's still much higher than it needs to be, but that will be fixed with the upcoming MMCO patch. Also make sure we don't write too large a frame_num or poc in slice header. commit d73b50e86b0d6beaa918c2855771002b19ded523 [revision 1274] Author: Anton Mitrofanov Date: Sat Sep 26 12:44:53 2009 -0700 Fix some issues with 3-pass statsfile handling The value of i_frame during encoder_close was incorrect. commit c4597c9684307df1fab0d76461eb914d031e8182 [revision 1273] Author: Anton Mitrofanov Date: Sat Sep 26 12:42:46 2009 -0700 Fix ctrl-C termation message with few frames encoded commit 24ef8748abb957fc4807299d4346779e11ac6c57 [revision 1272] Author: Fiona Glaser Date: Fri Sep 25 16:23:52 2009 -0700 Add support for single-frame VBV, improve compliance This allows both constant-framesize and capped-framesize encoding. Literal constant framesize isn't actually supported yet due to the lack of filler support. Example with 30fps video: --vbv-bufsize 200 --vbv-maxrate 6000 will ensure that no frame is ever larger than 200 kilobits. One example use-case of this is for zero-delay streaming where bandwidth costs need to be minimized. If every frame is smaller than 200 kilobits and the client has a 6 megabit connection, every single frame can be instantly sent to the client and handled without any decoder-side buffer. Fix a mistake in VBV calculation--this may have caused the VBV to be slightly non-compliant in some situations without x264 realizing it. Add primitive prediction handling for rows with quantizers lower than their reference. This slightly improves VBV in CBR mode. Various other minor improvements to VBV, mostly to make single-frame VBV work. Commit sponsored by a media streaming company that wishes to remain anonymous. commit e324d60ab2d9fd9cb5c837039a8c48e2052d1947 [revision 1271] Author: Fiona Glaser Date: Thu Sep 24 08:40:45 2009 -0700 Fix 10l in API change frame_num was set to 1, not 0, for the first frame. This broke spec compliance. Didn't actually seem to cause any problems though except for breaking decoding on Quicktime. commit 17fcf96e7e19ec391393c0ba2a67cd6f792131a5 [revision 1270] Author: Steven Walters Date: Wed Sep 23 15:04:02 2009 -0700 Allow user-set FPS for inputs other than YUV commit e0920d6fac5b51ab6ebc08482b2eace3a667cc1c [revision 1269] Author: Anton Mitrofanov Date: Wed Sep 23 12:31:53 2009 -0700 Improve threaded frame handling Avoid unnecessary cond_wait commit 510fa4fc25ac74d47ba5dc5c82aba45c8944afde [revision 1268] Author: Fiona Glaser Date: Tue Sep 22 17:30:19 2009 -0700 Attempt to detect miscompilation due to bug in gcc 4.2 I don't know if this bug still affects latest x264, but it can't hurt to try to detect it. Accordingly refuse to open the encoder if detected. Apparently VLC (on Windows) has been distributed for some time with a completely broken x264 due to the use of a completely broken compiler (gcc 4.2). In particular, the MV costs seem to be calculated incorrectly on win32 when linking from an application compiled without -ffast-math to an application with -ffast-math. I am not entirely certain why this occurs, but the result is, unsurprisingly, encoding quality that makes MPEG-2 look good, due to the motion search being completely broken. commit b454edb2c32910ec021ac46405c87b0ad0b1ee3b [revision 1267] Author: Steven Walters Date: Tue Sep 22 12:14:23 2009 -0700 Really fix encoder_close crash this time Not-entirely-fixed in r1253. commit a54f4f2b77c7f77cb86232a291c802c1d993f7e7 [revision 1266] Author: Fiona Glaser Date: Sun Sep 20 21:58:08 2009 -0700 Check for 16x16 partitions masquerading as smaller ones Saves a few bits when using qpel-RD. commit 2fe90066553da8a2e158259bd3e20939b3778b9d [revision 1265] Author: David Conrad Date: Sun Sep 20 01:16:51 2009 -0700 Update config.guess/sub; add Snow Leopard support commit 9e6650e9b523db04fa916af69bf5cfaa9fee6c4e [revision 1264] Author: Fiona Glaser Date: Sat Sep 19 09:50:59 2009 -0700 Fix integer overflow in 2-pass VBV Bug caused slight undersizing in 2-pass mode in some cases. commit c4c49802a61dd247798f50fd18c9449fcfb06977 [revision 1263] Author: Fiona Glaser Date: Fri Sep 18 14:28:31 2009 -0700 Fix bug with various bizarre commandline combinations and mbtree Second pass would have mbtree on even though the first pass didn't (and thus encoding would immediately fail). commit bbf573c75455ea02ea18bd718b65cd13a1d9a04c [revision 1262] Author: Fiona Glaser Date: Thu Sep 17 13:02:02 2009 -0700 Add intra prediction modes to output stats Also eliminate some NANs in stat output with intra-only encoding. Marginal speedup: disable stat calculation if log level is below X264_LOG_INFO. Various minor cosmetics. commit 90f12afa4759bc4c0dff4ebec41707a3146f6b8b [revision 1261] Author: Fiona Glaser Date: Wed Sep 16 21:34:48 2009 -0700 Overhaul syntax in muxers.c/matroska.c The inconsistent syntax in these files has finally come to an end. commit 7a0fbed78235a63bf8008d282f5db64ef1f3f2ec [revision 1260] Author: Fiona Glaser Date: Wed Sep 16 20:00:00 2009 -0700 Major API change: encapsulate NALs within libx264 libx264 now returns NAL units instead of raw data. x264_nal_encode is no longer a public function. See x264.h for full documentation of changes. New parameter: b_annexb, on by default. If disabled, startcodes are replaced by sizes as in mp4. x264's VBV now works on a NAL level, taking into account escape codes. VBV will also take into account the bit cost of SPS/PPS, but only if b_repeat_headers is set. Add an overhead tracking system to VBV to better predict the constant overhead of frames (headers, NALU overhead, etc). commit 8e67a586e02672ef7faf001bae1813200f8fb730 [revision 1259] Author: Fiona Glaser Date: Mon Sep 14 12:30:38 2009 -0700 Add missing fclose for mbtree input statsfile on second pass Bug report by VFRmaniac commit f81f14e23534c199c20d78e553a7e427a9cf2d8a [revision 1258] Author: Fiona Glaser Date: Mon Sep 14 11:07:23 2009 -0700 Improve progress indicator behavior Progress indicator will now indicate based on output frame, not input frame. commit 3f3b67f74b53e18744b1bf754d12b6eaae9dd3c5 [revision 1257] Author: Fiona Glaser Date: Mon Sep 14 03:21:14 2009 -0700 Update yasm configure check lzcnt apparently requires yasm 0.6.2. commit b1eac26510d0532ae9202249767e5f3ba22443ef [revision 1256] Author: Fiona Glaser Date: Sun Sep 13 01:02:37 2009 -0700 Make MV costs global instead of static Fixes some extremely rare threading race conditions and makes the code cleaner. Downside: slightly higher memory usage when calling multiple encoders from the same application. commit c8c060798aa0a43cd334f78b62fd23720024de9f [revision 1255] Author: Fiona Glaser Date: Fri Sep 11 17:30:14 2009 -0700 Don't print scenecut message multiple times in verbose mode Occurred mostly with b-adapt 2. commit 72fa3f9bbd5855ecfc2de1f1b7b1861cd2e20a21 [revision 1254] Author: Fiona Glaser Date: Thu Sep 10 02:55:21 2009 -0700 Optimize rounding of luma and chroma DC coefficients Reduce bitrate mostly-losslessly at low quantizers. In some rare cases, bitrate reduction may be as high as 10%. Luma rounding optimization (helps much less than chroma) requires trellis. commit 9e15b6d8e4b0c927c8ebf0abc75c4467437a91f9 [revision 1253] Author: Steven Walters Date: Wed Sep 9 12:19:40 2009 -0700 Fix crash if encoder_close is called before delayed frames are flushed Also no longer flush frames when ctrl-Cing x264, so x264 will close faster. commit 02e662e1818a7b83c3f8120b06ccbaa378a7c58e [revision 1252] Author: Fiona Glaser Date: Sun Sep 6 14:55:48 2009 -0700 Improve x264 help Now has three help options: --help, --longhelp, and --fullhelp. --help only shows the most basic options; most users should not need more than these. Add usage examples. Fix typo in a comment. commit d1f4237e0b5ba0718e88ef5529567872ea82477a [revision 1251] Author: Fiona Glaser Date: Sat Sep 5 19:22:21 2009 -0700 Factor out a redundant RD call in qpel-RD Fixes a problem that was supposed to be, but didn't, get fully fixed in r1238. commit 5858d3dc48dfaacdb608659aaf8721958327b26d [revision 1250] Author: Fiona Glaser Date: Sat Sep 5 18:56:18 2009 -0700 Fix RD early-skip Small quality improvement and speedup, was broken by r1214. commit 6093a383fb0a2f82aeaf249841797ea4d4e88e1d [revision 1249] Author: Fiona Glaser Date: Sat Sep 5 18:55:46 2009 -0700 Faster CAVLC mb header writing for B macroblocks commit de4c39b71d013a87aea50ed1075263dc1e579b01 [revision 1248] Author: David Conrad Date: Wed Sep 2 16:14:59 2009 -0700 Compile fixes for pre-ARMv6T2 and/or PIC commit bc120190edf7db86f44ed44ffad31271ad1294c7 [revision 1247] Author: Steven Walters Date: Wed Sep 2 12:33:50 2009 -0700 Change priority handling on some OSs Instead of setting the lookahead thread to max priority, lower all the other threads' priorities instead. This is particularly useful when the "max priority" is "realtime", as in Windows, which can cause some problems. commit 6940dcaef140d8a0c43c9a62db158e9d71a8fdeb [revision 1246] Author: Steven Walters Date: Tue Sep 1 18:46:51 2009 -0700 Threaded lookahead Move lookahead into a separate thread, set to higher priority than the other threads, for optimal performance. Reduces the amount that lookahead bottlenecks encoding, greatly increasing performance with lookahead-intensive settings (e.g. b-adapt 2) on many-core CPUs. Buffer size can be controlled with --sync-lookahead, which defaults to auto (threads+bframes buffer size). Note that this buffer is separate from the rc-lookahead value. Note also that this does not split lookahead itself into multiple threads yet; this may be added in the future. Additionally, split frames into "fdec" and "fenc" frame types and keep the two separate. This split greatly reduces memory usage, which helps compensate for the larger lookahead size. Extremely special thanks to Michael Kazmier and Alex Giladi of Avail Media, the original authors of this patch. commit 7df6f5d62983432414016f5ec18f71f17626354e [revision 1245] Author: Fiona Glaser Date: Tue Sep 1 11:36:54 2009 -0700 Force a link error in case of incompatible API This is because the number of bug reports due to miscompiled ffmpeg builds is reaching critical mass. The name of x264_encoder_open is now #defined based on the current X264_BUILD. Note that this changes the calling convention required for dlopen, but not for ordinary calls to x264_encoder_open. commit ec2f6f4f93df9fb7c67a172669a2b629335391d5 [revision 1244] Author: Fiona Glaser Date: Mon Aug 31 22:44:45 2009 -0700 Get rid of "CBR" descriptor from qcomp Though technically accurate in some vague way, I have never actually seen this option used correctly, rather it has been used by hundreds of people who can't read the documentation and believe that qcomp=0 is what should be used for CBR encoding. commit 4767b0e12e335f6057327a18992a3c97abedabbb [revision 1243] Author: Loren Merritt Date: Sun Aug 30 20:49:07 2009 +0000 Faster me=tesa But it still spends all too much time in me_search_ref rather than asm. commit 4ccbb1998c81c5533c17da91aa67b62a5d9857c8 [revision 1242] Author: Fiona Glaser Date: Mon Aug 31 06:36:41 2009 -0700 Multi-slice encoding support Slicing support is available through three methods (which can be mixed): --slices sets a number of slices per frame and ensures rectangular slices (required for Blu-ray). Overridden by either of the following options: --slice-max-mbs sets a maximum number of macroblocks per slice. --slice-max-size sets a maximum slice size, in bytes (includes NAL overhead). Implement macroblock re-encoding support to allow highly accurate slice size limitation. Might be useful for other things in the future, too. commit 57223706e5d32df207e9b3f64e2b36c4c3b78022 [revision 1241] Author: Fiona Glaser Date: Sat Aug 29 17:09:55 2009 -0700 Fix a valgrind warning in b-adapt 2 commit 22342aa3bfedb2ad29fde8d236145db2021614dc [revision 1240] Author: Loren Merritt Date: Sat Aug 29 10:31:08 2009 +0000 fix asm symbols for oprofile (regression in r1221) commit 5c08b9142d327c8ba910c2b399d804f6794182a5 [revision 1239] Author: Anton Mitrofanov Date: Fri Aug 28 15:07:12 2009 -0700 Fix bug in intra analysis in B-frames i8x8/i4x4 never got analysed when fast_intra was toggled and RD was off; up to a 2-3% quality improvement in non-RD mode. With this bug dating back to r369, this is probably the second-oldest bug ever fixed in x264. commit 3c3239bb8d7d89a9879502256bcde6066fef7cb0 [revision 1238] Author: Anton Mitrofanov Date: Fri Aug 28 14:56:44 2009 -0700 Fix bug in b16x16 qpel RD Incorrect cost was used to initialize the search. commit af2739b786fb702018f0d0266dfc40d81a32162c [revision 1237] Author: Fiona Glaser Date: Thu Aug 27 15:21:22 2009 -0700 Check minimum chroma QP in addition to luma QP during CQM init Correctly error out if the implied minimum chroma QP is too low. Add missing emms to checkasm macroblock_tree_propagate test. commit 65068aab7e2c1b923be766951f684027923ac4d6 [revision 1236] Author: Fiona Glaser Date: Thu Aug 27 14:16:45 2009 -0700 Faster mbtree propagate and x264_log2, less memory usage Avoid an int->float conversion with a small table. Change lowres_inter_types to a bitfield; cut its size by 75%. Somewhat lower memory usage with lots of bframes. Make log2/exp2 tables global to avoid duplication. commit adc25db91ebef53b7883bb1587df1dd2247e4f21 [revision 1235] Author: Fiona Glaser Date: Wed Aug 26 20:30:47 2009 -0700 Fix keyint=1 + VBV + rc-lookahead commit 2d3958bfda22a24f54095007b25eb96d521086f5 [revision 1234] Author: Fiona Glaser Date: Wed Aug 26 20:16:10 2009 -0700 Faster x264_exp2fix8 22->13 cycles on Core 2 with mfpmath=sse commit 252fcf4b0a1b0318ec246f45bd934efac9de9c50 [revision 1233] Author: Loren Merritt Date: Thu Aug 27 06:05:57 2009 +0000 compile x86 with fpmath=sse by default commit efa85578a7a19f3f71b0bcae194cfbfb10f2f319 [revision 1232] Author: David Conrad Date: Mon Aug 24 17:17:41 2009 -0700 ARM configure: enable NEON-related options by default When compiling for ARM, x264 will compile by default for Cortex A8 unless specified otherwise. To compile for pre-ARMv6, --disable-asm is required. commit 918808f897cb7e1e401b1f1cba560957985f1682 [revision 1231] Author: Fiona Glaser Date: Mon Aug 24 03:28:11 2009 -0700 2-pass VBV fixes Properly run slicetype frame cost with 2pass + MB-tree. Slash the VBV rate tolerance in 2-pass mode; increasing it made sense for the highly reactive 1-pass VBV algorithm, but not for 2-pass. 2-pass's planned frame sizes are guaranteed to be reasonable, since they are based on a real first pass, while 1-pass's, based on lookahead SATD, cannot always be trusted. commit 50d7fb80d8cb773cd6d495e083867c3685726352 [revision 1230] Author: David Conrad Date: Mon Aug 24 01:38:42 2009 -0700 GSOC merge part 8: ARM NEON intra prediction assembly functions (partial) 4x4 dc/h/ddr/ddl, 8x8 dc/h, 8x8c h/v, 16x16 dc/h/v commit 350a558808816ad54d3ad01d795a43920738f586 [revision 1229] Author: David Conrad Date: Mon Aug 24 01:10:30 2009 -0700 GSOC merge part 7: ARM NEON deblock assembly functions (partial) Originally written for ffmpeg by Mans Rullgard; ported by David. Luma and chroma inter deblocking; no intra yet. commit 2dcc6072d12deaf27705dc2beb63e192bd590232 [revision 1228] Author: David Conrad Date: Mon Aug 24 00:58:42 2009 -0700 GSOC merge part 6: ARM NEON quant assembly functions (partial) (de)quant 4x4, (de)quant 8x8, (de)quant DC, coeff_last commit a591e8856ee2b919d21bcc51e5eb88e9f4fb6d94 [revision 1227] Author: David Conrad Date: Sun Aug 23 02:03:48 2009 -0700 GSOC merge part 5: ARM NEON dct assembly functions (i)dct4x4dc, (i)dct4x4, (i)dct8x8, (i)dct_dc, zigzag_scan_frame_4x4 commit 6bf21c631a0cf073ad0503e6f3a9eeabacc5078a [revision 1226] Author: David Conrad Date: Sun Aug 23 01:35:10 2009 -0700 GSOC merge part 4: ARM NEON mc assembly functions prefetch, memcpy_aligned, memzero_aligned, avg, mc_luma, get_ref, mc_chroma, hpel_filter, frame_init_lowres commit 52f9719b4c3e58aaa6cbd6d83950444e022aefea [revision 1225] Author: David Conrad Date: Sat Aug 22 23:55:29 2009 -0700 GSOC merge part 3: ARM NEON pixel assembly functions SAD, SADX3/X4, SSD, SATD, SA8D, Hadamard_AC, VAR, VAR2, SSIM commit ca7da1aecdfdccaa4f7669e915348f6d31f85827 [revision 1224] Author: David Conrad Date: Sat Aug 22 23:40:33 2009 -0700 GSOC merge part 2: ARM stack alignment Neither GCC nor ARMCC support 16 byte stack alignment despite the fact that NEON loads require it. These macros only work for arrays, but fortunately that covers almost all instances of stack alignment in x264. commit 1a072a3a013976a178e0068be021e23b9a0ed59f [revision 1223] Author: David Conrad Date: Thu Aug 20 20:44:09 2009 -0700 Fix unaligned accesses in bitstream writer Fixes x264 on CPUs with no unaligned access support (e.g. SPARC). Improves performance marginally on CPUs with penalties for unaligned stores (e.g. some x86). commit 77c46ebc7d35de283fc27662e21d866be1b45773 [revision 1222] Author: Fiona Glaser Date: Thu Aug 20 13:08:25 2009 -0700 Fix bug in calculation of I-frame costs with AQ. commit fb62734c26f1a25f7009c9ec01849019ca454b4d [revision 1221] Author: David Conrad Date: Wed Aug 19 17:03:02 2009 -0700 GSOC merge part 1: Framework for ARM assembly optimizations x264 will detect which ARM core it's building for and only build NEON asm if the target is ARMv6 or above, then enable NEON at runtime. commit 8368e151c7b89c4a36cdc08b75ca490798d62c9d [revision 1220] Author: David Conrad Date: Wed Aug 19 16:18:36 2009 -0700 Fix a bug in checkasm and two OSX fixes MC chroma checkasm test could crash in some situations Remove -lmx, as it's not needed and the iPhone doesn't have it. Remove unused sqrtf emulation; it breaks if math.h is included. commit bde792fee0ac6d32d534918db870c94fe106b6e3 [revision 1219] Author: Fiona Glaser Date: Wed Aug 19 01:49:47 2009 -0700 Improve QPRD Always check the last macroblock's QP, even if the normal search doesn't reach it. Raise the failure threshold when moving towards the last macroblock's QP. 0.2-1% improved compression. commit 4e824bbcafaf16a4736db0028fdf6dd542f3ed35 [revision 1218] Author: Fiona Glaser Date: Tue Aug 18 21:53:28 2009 -0700 Fix MB-tree with keyint<3 Also slightly improve VBV keyint handling. commit 678b317aca6f75ecab89cf21c31e05748e9d2a5f [revision 1217] Author: Fiona Glaser Date: Tue Aug 18 19:25:45 2009 -0700 Fix bug in VBV lookahead + no MB-tree I-frames need to have VBV lookahead run on them as well. commit c83699f10f252998a42471294a8d97bb20f94296 [revision 1216] Author: Fiona Glaser Date: Tue Aug 18 18:37:26 2009 -0700 Add support for frame-accurate parameter changes Parameter structs can now be passed with individual frames. The previous method would only change the parameter of what was currently being encoded, which due to delay might be very far from an intended exact frame. Also add support for changing aspect ratio. Only works in a stream with repeating headers and requires the caller to force an IDR to ensure instant effect. commit 6a5a20431f448aaa43036cdaa024c8017d63fa04 [revision 1215] Author: Fiona Glaser Date: Tue Aug 18 15:46:26 2009 -0700 Fix x264_encoder_reconfig with multithreading New behavior: reconfigging the encoder will result in changes being applied to each of the encoding threads as they finish encoding the current frame. commit ba0c03511a7c8d6c8327c07b5a5870d4746be3eb [revision 1214] Author: Fiona Glaser Date: Sun Aug 16 03:29:49 2009 -0700 Fix two bugs in QPRD QPRD could in some cases force blocks to skip when they shouldn't be ~(+0.01db) Force QPRD to abide by qpmin/qpmax restrictions. commit 30a82c75c9bf38f47ab1dd1f505891505dda54da [revision 1213] Author: Fiona Glaser Date: Sat Aug 15 19:02:31 2009 -0700 Lookahead VBV Use the large-scale lookahead capability introduced in MB-tree for ratecontrol purposes. (Does not require MB-tree, however.) Greatly improved quality and compliance in 1-pass VBV mode, especially in CBR; +2db OPSNR or more in some cases. Fix some other bugs in VBV, which should improve non-lookahead mode as well. Change the tolerance algorithm in row VBV to allow for more significant mispredictions when buffer is nearly full. Note that due to the fixing of an extremely long-standing bug (>1 year), bitrates may change by nontrivial amounts in CRF without MB-tree. commit 50f7afcd0a21a01ac7aae72941747c2061db8d2e [revision 1212] Author: Fiona Glaser Date: Fri Aug 14 07:20:07 2009 -0700 Fix bug in b-adapt 1 B-adapt 1 didn't use more than MAX(1,bframes-1) B-frames when MB-tree was off. commit e586d699b2f13364aa443b367dba9fe38699f5de [revision 1211] Author: Fiona Glaser Date: Thu Aug 13 17:13:33 2009 -0700 Fix a potential failure in VBV If VBV does underflow, ratecontrol could be permanently broken for the rest of the clip. Revert part of the previous VBV changes to fix this. commit db724ac24e6a94c896e1bac6f4c7b5a5504ed773 [revision 1210] Author: Anton Mitrofanov Date: Thu Aug 13 21:40:21 2009 +0000 new API function x264_encoder_delayed_frames. fix x264cli on streams whose total length is less than the encoder latency. commit 9179e923c80d4720950ae90187bbcb2cd13f430a [revision 1209] Author: Fiona Glaser Date: Thu Aug 13 14:12:26 2009 -0700 Add no-mbtree to fprofile (and fix pyramid in fprofile) commit db12af7a44c498a75b6f3d72ec8836fb75050f14 [revision 1208] Author: Fiona Glaser Date: Sun Aug 9 16:06:52 2009 -0700 Don't print a warning about direct=auto in 2pass when B-frames are off commit f52973d2e93810bc86b6e8e3358d0365b97a409d [revision 1207] Author: Loren Merritt Date: Thu Aug 13 05:02:59 2009 +0000 fix lowres padding, which failed to extrapolate the right side for some resolutions. fix a buffer overread in x264_mbtree_propagate_cost_sse2. no effect on actual behavior, only theoretical correctness. fix x264_slicetype_frame_cost_recalculate on I-frames, which previously used all 0 mb costs. shut up a valgrind warning in predict_8x8_filter_mmx. commit e9ff8c4b1f647135f7b920fad69c616ccb08459a [revision 1206] Author: Loren Merritt Date: Sun Aug 9 04:00:36 2009 +0000 simd part of x264_macroblock_tree_propagate. 1.6x faster on conroe. commit 5599c4788e4ce72c04e536723075f7547deeaec3 [revision 1205] Author: Loren Merritt Date: Sat Aug 8 14:53:27 2009 +0000 MB-tree fixes: AQ was applied inconsistently, with some AQed costs compared to other non-AQed costs. Strangely enough, fixing this increases SSIM on some sources but decreases it on others. More investigation needed. Account for weighted bipred. Reduce memory, increase precision, simplify, and early terminate. commit efebe7d7b92678bfd9dacbf22068387aeff3da07 [revision 1204] Author: Fiona Glaser Date: Sat Aug 8 17:51:01 2009 -0700 Add missing free()s for new data allocated for MB-tree Eliminates a memory leak. commit 599f024c88b0978c892838c1af5a01ed1966a74f [revision 1203] Author: Fiona Glaser Date: Sat Aug 8 12:53:06 2009 -0700 Fix keyframe insertion with MB-tree and no B-frames commit 4cbc551150f9649b2e636e433af2204d353b3bc9 [revision 1202] Author: Fiona Glaser Date: Sat Aug 8 11:26:36 2009 -0700 Fix MP4 output (bug in malloc checking patch) commit 6eab44d4295801c28184ec13f03f9727c60129cc [revision 1201] Author: Steven Walters Date: Fri Aug 7 16:18:01 2009 -0700 Gracefully terminate in the case of a malloc failure Fuzz tests show that all mallocs appear to be checked correctly now. commit 7dec1a1574e6b94959d9bf3e997cea146289f9a7 [revision 1200] Author: Anton Mitrofanov Date: Fri Aug 7 10:44:13 2009 -0700 Fix a potential infinite loop in QPfile parsing on Windows ftell doesn't seem to work properly on Windows in text mode. commit 3667fbf979136302e990df35d850e05cf8de8115 [revision 1199] Author: Fiona Glaser Date: Fri Aug 7 10:31:16 2009 -0700 Fix delay calculation with multiple threads Delay frames for threading don't actually count as part of lookahead. commit 07178d3c8737aa9660d1ab11ace9c54bbe5724b6 [revision 1198] Author: Fiona Glaser Date: Thu Aug 6 23:09:46 2009 -0700 Add "veryslow" preset Apparently some people are actually *using* placebo, so I've added this preset to bridge the gap. commit 835ccc3cec908b1febfd31613d3e6583628116b3 [revision 1197] Author: Fiona Glaser Date: Tue Aug 4 17:46:33 2009 -0700 Macroblock-tree ratecontrol On by default; can be turned off with --no-mbtree. Uses a large lookahead to track temporal propagation of data and weight quality accordingly. Requires a very large separate statsfile (2 bytes per macroblock) in multi-pass mode. Doesn't work with b-pyramid yet. Note that MB-tree inherently measures quality different from the standard qcomp method, so bitrates produced by CRF may change somewhat. This makes the "medium" preset a bit slower. Accordingly, make "fast" slower as well, and introduce a new preset "faster" between "fast" and "veryfast". All presets "fast" and above will have MB-tree on. Add a new option, --rc-lookahead, to control the distance MB tree looks ahead to perform propagation analysis. Default is 40; larger values will be slower and require more memory but give more accurate results. This value will be used in the future to control ratecontrol lookahead (VBV). Add a new option, --no-psy, to disable all psy optimizations that don't improve PSNR or SSIM. This disables psy-RD/trellis, but also other more subtle internal psy optimizations that can't be controlled directly via external parameters. Quality improvement from MB-tree is about 2-70% depending on content. Strength of MB-tree adjustments can be tweaked using qcompress; higher values mean lower MB-tree strength. Note that MB-tree may perform slightly suboptimally on fades; this will be fixed by weighted prediction, which is coming soon. commit 93cc2893a9d4daf2d798f3cafddb499cabb3c0d7 [revision 1196] Author: Fiona Glaser Date: Mon Aug 3 20:52:30 2009 -0700 Various 1-pass VBV tweaks Make predictors have an offset in addition to a multiplier. This primarily fixes issues in sources with lots of extremely static scenes, such as anime and CGI. We tried linear regressions, but they were very unreliable as predictors. Also allow VBV to be slightly more aggressive in raising QPs to avoid not having enough bits left in some situations. Up to 1db improvement on some clips. commit 1d735afb29000c040daaf48040bdb4d3423f8352 [revision 1195] Author: Fiona Glaser Date: Tue Jul 28 20:41:27 2009 -0700 Fix another 10L in QPRD An entry in subpel_iterations was missing. I have no idea how QPRD was working at all without this change. commit cd707257878590367fb9adbe2403d11e53dc5ae3 [revision 1194] Author: Fiona Glaser Date: Tue Jul 28 01:16:23 2009 -0700 Update help and cleanup in ratecontrol.c Deal with some out-of-date information. commit b8c7499d4b83c920f80f4aa09072829cc034eb5f [revision 1193] Author: Loren Merritt Date: Tue Jul 28 07:16:31 2009 +0000 15% faster refine_bidir_satd, 10% faster refine_bidir_rd (or less with trellis=2) re-roll a loop (saves 44KB code size, which is the cause of most of this speed gain) don't re-mc mvs that haven't changed commit b08410d07ea242250fcb827742c74046d59bd991 [revision 1192] Author: Fiona Glaser Date: Mon Jul 27 21:03:00 2009 -0700 Faster bidir_rd plus some bugfixes Cache chroma MC during refine_bidir_rd and use both the luma and chroma caches to skip MC in macroblock_encode. Fix incorrect call to rd_cost_part; refine_bidir_rd output was incorrect for i8>0. Remove some redundant clips. ~12% faster refine_bidir_rd. commit 47c8783c667df1cd785cccc35752c4107fdb85dc [revision 1191] Author: Fiona Glaser Date: Mon Jul 27 04:45:03 2009 -0700 Add "fastdecode" tune option It does what it says it does. commit 9ea7b69df504b8990f339e2c8578a516f9df00c7 [revision 1190] Author: Fiona Glaser Date: Sun Jul 26 12:20:09 2009 -0700 Fix two bugs in QPRD fprofile settings now actually fprofile QPRD. Don't use i_mbrd before initializing it. commit 11f504412ecfc03274c9b3a6c05c0f05edf440ba [revision 1189] Author: Fiona Glaser Date: Sun Jul 26 03:03:12 2009 -0700 Fix 10l in QPRD Trellis used wrong lambda with trellis=1 commit fa3b8139a19d578c12c87e20a3215b41462866b4 [revision 1188] Author: Fiona Glaser Date: Sat Jul 25 22:31:06 2009 -0700 Fix a nondeterminism with threads and subme>7 Also add a few more checks to eliminate the need for spel_border. commit 4304c427fd6419b205c42aa139bfd8cebbdf60bf [revision 1187] Author: Fiona Glaser Date: Thu Jul 23 12:20:39 2009 -0700 Add QPRD support as subme=10 Refactor trellis lambda selection to be done in analyse_init instead of in trellis. This will allow for more easy adaption of lambda later on; for now it allows constant lambda across variable QPs. QPRD is only available with adaptive quantization enabled and generally improves SSIM and visual quality. Additionally, weight the SSD values from RD based on the relative QP offset for chroma; helps visually at high QPs where chroma has a lower QP than luma. This fixes some visual artifacts created by QPRD at high QPs. Note that this generally hurts PSNR and SSIM, and so is only on when psy-RD is on. commit d68f3b076acb1674c7cce95aaa2dc62372bbf7f4 [revision 1186] Author: Fiona Glaser Date: Tue Jul 21 19:56:21 2009 -0700 SSSE3 cachesplit workaround for avg2_w16 Palignr-based solution for the most commonly used qpel function. 1-1.5% faster overall on Core 2 chips. commit 9dfccce4181da4d4c1e61a707c338fc65310b1ad [revision 1185] Author: Loren Merritt Date: Wed Jul 22 20:20:52 2009 +0000 shut up valgrind warnings in trellis commit 2e1db1f6d5b52886b3c77338ed24096a188134b1 [revision 1184] Author: Anton Mitrofanov Date: Sat Jul 18 16:30:18 2009 -0700 New AQ algorithm option "Auto-variance" uses log(var)^2 instead of log(var) and attempts to adapt strength per-frame. Generates significantly better SSIM; on by default with --tune ssim. Whether it generates visually better quality is still up for debate. Available as --aq-mode 2. commit a79dc7b5bc6e95508d8456681c30b57605a05fd0 [revision 1183] Author: Fiona Glaser Date: Wed Jul 15 12:43:35 2009 -0700 Cacheline-split SSSE3 chroma MC ~70% faster chroma MC on 32-bit Conroe Also slightly faster SSSE3 intra_sad_8x8c commit 1921079dd03d36502308379c4437e4440a970473 [revision 1182] Author: Fiona Glaser Date: Sun Jul 12 12:07:01 2009 -0700 Improve documentation of qp/crf options commit bcf540a8f14595e7fa5dcdf2d53e321f46f1deeb [revision 1181] Author: Fiona Glaser Date: Thu Jul 9 19:02:57 2009 -0700 Merge array_non_zero into zigzag_sub Faster lossless, cleaner code. SSSE3 version of zigzag_sub_4x4_field, faster lossless interlaced coding. commit 5394872fcdde4c29a66d3e7902dca03c3b941947 [revision 1180] Author: James Darnley Date: Thu Jul 9 11:25:55 2009 -0700 Fix bug in reference frame autoadjustment For some types of input file, x264 did the adjustment before width/height were known. commit c08cdc866735168732d7cf731e5171e2f3ff04a9 [revision 1179] Author: Fiona Glaser Date: Tue Jul 7 11:13:39 2009 -0700 Fix fprofile settings to match changes in defaults Also add b-adapt 2 to fprofile. commit 1be01cb3fb9efec44be81714ae25bc272fe6c6cf [revision 1178] Author: Fiona Glaser Date: Fri Jul 3 02:33:44 2009 -0700 Slightly faster dequant_flat assembly Eliminate some redundant shifts. commit 71b9d885aacd1cc86851248af6824ed0cd965d98 [revision 1177] Author: Fiona Glaser Date: Wed Jul 1 21:14:57 2009 -0700 Totally new preset system for x264.c (not libx264), new defaults Other new features include "tune" and "profile" settings; see --help for more details. Unlike most other settings, "preset" and "tune" act before all other options. However, "profile" acts afterwards, overriding all other options. Our defaults have also changed: new defaults are --subme 7 --bframes 3 --8x8dct --no-psnr --no-ssim --threads auto --ref 3 --mixed-refs --trellis 1 --weightb --crf 23 --progress. Users will hopefully find these changes to greatly improve usability. commit 8878778c59c0417e40521e0b42412bc314ece487 [revision 1176] Author: Fiona Glaser Date: Wed Jul 1 16:33:12 2009 -0700 Update Gabriel's email address in AUTHORS commit 205a032c22467c90c26d33ed9ab23d60461e57c1 [revision 1175] Author: Fiona Glaser Date: Tue Jun 30 15:20:32 2009 -0700 Early termination for chroma encoding Faster chroma encoding by terminating early if heuristics indicate that the block will be DC-only. This works because the vast majority of inter chroma blocks have no coefficients at all, and those that do are almost always DC-only. Add two new helper DSP functions for this: dct_dc_8x8 and var2_8x8. mmx/sse2/ssse3 versions of each. Early termination is disabled at very low QPs due to it not being useful there. Performance increase is ~1-2% without trellis, up to 5-6% with trellis=2. Increase is greater with lower bitrates. commit 8a96d510fd0aef8ccf73717754482c03c4063c0d [revision 1174] Author: David Conrad Date: Fri Jun 26 13:09:44 2009 -0700 Fix bug in checkasm frame_init_lowres_core check didn't check the C plane. However, all x86 and PPC assembly was correct regardless of the unit test being incorrect. commit e0d1cad14c5251fd21aef99c92734d461200b779 [revision 1173] Author: Fiona Glaser Date: Wed Jun 24 14:39:15 2009 -0700 Add subpartition cost for sub-8x8 blocks Improves sub-p8x8 mode decision. commit 1b3a43306c8a8efa9e45380d58c1acd488069c2a [revision 1172] Author: Fiona Glaser Date: Wed Jun 24 13:24:18 2009 -0700 Yet more CABAC and CAVLC optimizations Also clean up a lot of pointless code duplication in CAVLC MV coding. commit 90bec46ba524c3e1a4facaeb3ea21b9ef08e614b [revision 1171] Author: Fiona Glaser Date: Fri Jun 19 18:49:55 2009 -0700 Various CABAC optimizations and cleanups Faster CABAC CBF context calculation for inter blocks. Add x264_constant_p(), will probably be useful in the future as well. Simpler subpartition functions. Clean up and optimize mvd_cpn a bit more. Various other minor optimizations. commit 3a61047871d39ddaecfb58f78ce5235ca9786a2d [revision 1170] Author: David Wolstencroft Date: Sat Jun 20 21:42:55 2009 +0200 AltiVec version of frame_init_lowres_core. 22.4x faster than C on PPC7450 and 25x on PPC970MP. commit 42e179e84b8563eff62efcfbee0d947f09100fd4 [revision 1169] Author: Fiona Glaser Date: Fri Jun 19 16:03:18 2009 -0700 MMX CABAC mvd sum calculation Faster CABAC mvd coding. commit 46b107980bdc234c3bff9aae10d99e7b65551426 [revision 1168] Author: Fiona Glaser Date: Fri Jun 19 16:02:39 2009 -0700 Faster MV prediction Smaller code size, plus I get to use goto. commit 84fc0be90329bffd3c3f4515463cc7348bccc366 [revision 1167] Author: Fiona Glaser Date: Wed Jun 10 10:37:01 2009 -0700 Fix potential crash in checkasm ssim_end4_sse2 requires aligned sums commit 892dad35970375e99da6e047f677964b8eb69fc8 [revision 1166] Author: Fiona Glaser Date: Wed Jun 10 10:11:00 2009 -0700 SSSE3, faster SSE2/MMX integral_init4v The real reason I wrote this was an excuse to use shufpd. commit ebd85507c8c6eadb77c360ce3966a6ad4a5341d9 [revision 1165] Author: Mike Frysinger Date: Thu Jun 11 08:29:27 2009 +0000 configure check for uclinux commit b67ef31c400d107e52a7592ef19a3f62b6267920 [revision 1164] Author: Loren Merritt Date: Thu Jun 11 08:27:46 2009 +0000 fix a crash on frame width <= 48 pixels commit 20889345f7d6c13a8628e45f54c48df3e6793f97 [revision 1163] Author: Loren Merritt Date: Wed May 27 20:47:18 2009 +0000 configure check for cc, rather than reporting lack of compiler as an asm error. configure check for -mno-cygwin, since it's removed from gcc4. commit 3e6b5309229856eb80d7dde016cc33ac9afa5869 [revision 1162] Author: Loren Merritt Date: Sun May 24 05:01:26 2009 +0000 a better way to keep track of mv candidates. 2-4% faster dia, hex, and umh. commit 803482488c0d220929e5338b76249b511a034204 [revision 1161] Author: Loren Merritt Date: Sun May 24 05:01:19 2009 +0000 reorder some motion estimation patterns. this change is useless on its own, but segregates the bitstream-changing part out of my next optimization. commit b53f25fa3ff4a7445386c926eb018b0cb630f59e [revision 1160] Author: Loren Merritt Date: Mon May 25 19:16:05 2009 -0400 Fix VBV warning broken in r915 x264 will now correctly warn about maxrate specified without bufsize even when a level is not set. commit ba39abd8f25ac9a094bbd9e89692e9f52de9d1ce [revision 1159] Author: Loren Merritt Date: Mon May 25 07:03:10 2009 +0000 configure check for ssse3-capable binutils commit eb3759477da1397153a6be504d1caf45eea8a080 [revision 1158] Author: Fiona Glaser Date: Sun May 24 16:58:08 2009 -0400 Fix 10L in r1155 Broke --me esa/tesa due to forgetting to add handling for x264_cost_mv_fpel. commit ded0dcd5806440eda4f7ffb072f8e13f8b185171 [revision 1157] Author: Fiona Glaser Date: Fri May 22 21:28:15 2009 -0700 Fix bug where satd was incorrectly used with subme<=1 Faster subme<=1 with i4x4 enabled. commit 4078e4be9a3640c9f5b33a6734071c7009cf96f7 [revision 1156] Author: Fiona Glaser Date: Fri May 22 20:40:27 2009 -0700 Remove some pointless error handling code in cabac/cavlc commit 1aed7cd36955e1dcd2ed3e5cd1605b0978e7e9c1 [revision 1155] Author: Fiona Glaser Date: Fri May 22 18:40:12 2009 -0700 Save some memory on mv cost arrays Have quantizers that use the same lambda share the same cost array. commit d6261b812226bc61c0a55531501c51ea172cda9c [revision 1154] Author: Fiona Glaser Date: Fri May 22 16:57:33 2009 -0700 Various CABAC and CAVLC optimizations Backport CAVLC partial-inlining early termination to CABAC (~2-4% faster CABAC residual coding) commit 303e985d09f6562cf3a52327d30e3120fa481008 [revision 1153] Author: Loren Merritt Date: Tue May 19 02:47:15 2009 +0000 fix a race condition at the end of thread_input commit 17b86284821abc7762eea22dfcf1a72e58c6b0e8 [revision 1152] Author: Fiona Glaser Date: Mon May 18 22:40:45 2009 -0400 Various trellis speed optimizations commit 8dbe5d467d218484376650b1349b5350639ea5fb [revision 1151] Author: Fiona Glaser Date: Sat May 16 12:16:34 2009 -0700 Make i686 the default arch on x86_32 Disabling asm will default to a generic arch. Also fix configure for gcc 4.4. commit 39f9a29f31293098fceed3e7d07bc860bc03b6ad [revision 1150] Author: Fiona Glaser Date: Fri May 15 20:07:59 2009 -0700 Faster signed golomb coding 3% faster CAVLC RDO and bitstream writing. commit ba5ef93da41010d213a81d4f4c0d4db8e6fcc2d6 [revision 1149] Author: Fiona Glaser Date: Thu May 14 04:11:15 2009 -0700 Faster spatial direct MV prediction unroll/tweak col_zero_flag commit 094a4edf89facbfbba50a7578fb824bace9eaebe [revision 1148] Author: Fiona Glaser Date: Mon May 4 04:19:28 2009 -0700 More CABAC and CAVLC optimizations Simplified function calling for block_residual_write_(cabac|cavlc) and improved sigmap coding. Tried making 0/1-bit specific versions of CABAC asm, but benefit was minimal under GCC 4.3. Helped a decent bit under 3.4, but you shouldn't be using such old versions anyways. commit a61eab5a3d14981162805cb279c12d4ccf6302d4 [revision 1147] Author: Fiona Glaser Date: Wed Apr 29 22:54:52 2009 -0700 Various optimizations in frametype lookahead commit 1f57251003aa2fa82000ba86fbb04d6911505bd8 [revision 1146] Author: Fiona Glaser Date: Sun Apr 26 22:13:17 2009 -0700 Some cosmetics/cleanup Move some macros to x86util.asm that should have been there to begin with. Fix a typo that didn't cause any issues. commit 57505e301e81ea6fbeaa1a5503f05250335ab1d1 [revision 1145] Author: Guillaume Poirier Date: Tue Apr 21 21:18:44 2009 +0000 fix "incompatible types in initialization" compilation issues with GCC 4.3 (which is stricter than previous compiler version) commit b8745339e244e3b404a0023fb1d106fdccde509c [revision 1144] Author: Guillaume Poirier Date: Tue Apr 21 17:32:21 2009 +0200 fix conversions between vectors with differing element types or numbers of subparts errors commit 448ea68827a3e16fd7c8c90880fefe1d85a17c5a [revision 1143] Author: Fiona Glaser Date: Sat Apr 18 16:07:53 2009 -0700 Add "coded blocks" stat to output information. This measures the total percentage of blocks, intra and inter, which have nonzero coefficients. "y,uvAC,uvDC" refers to luma, chroma DC, and chroma AC blocks. Note that skip blocks are included in this stat. commit 6eb29353d0a64b719305555d0c2c0727e0efa797 [revision 1142] Author: Fiona Glaser Date: Fri Apr 17 23:38:29 2009 -0700 Enable asm predict_8x8_filter I'm not entirely sure how this snuck its way out of holger's intra pred patch. commit 840f7a5e6322c5598bd9801f06d0ed83f83fbe41 [revision 1141] Author: Fiona Glaser Date: Fri Apr 17 06:00:39 2009 -0700 Remove various bits of dead code found by CLANG. commit 6217838338477b4d37110398e86f5031790ae703 [revision 1140] Author: Fiona Glaser Date: Tue Apr 14 14:47:02 2009 -0700 Slightly faster SSE4 SA8D, SSE4 Hadamard_AC, SSE2 SSIM shufps is the most underrated SSE instruction on x86. commit 2bcc39fd4cb14bb5d8776d2dc560ebdce4eaf20a [revision 1139] Author: Fiona Glaser Date: Thu Apr 9 02:14:41 2009 -0700 Various CABAC optimizations Move calculation of b_intra out of the core residual loop and hardcode it where applicable. Inlining cabac_mb_mvd was unnecessary and wasted tremendous amounts of code size. Inlining only cache_mvd is faster and significantly smaller. commit bf749f764aa24ad77502907eaeb1ba9e0d82d035 [revision 1138] Author: Fiona Glaser Date: Wed Apr 8 05:45:03 2009 -0700 CAVLC optimizations faster bs_write_te, port CABAC context selection optimization to CAVLC. commit be3c3d21a188ed5e96d1ed146a282f156be4b677 [revision 1137] Author: Fiona Glaser Date: Sun Apr 5 13:01:42 2009 -0700 Faster CABAC RDO Since the bypass case is quite unlikely, especially when doing merged sigmap/level coding, it's faster to use a branch than a cmov. commit 18494e61ce99907a8826bd45eba75a88e6762fea [revision 1136] Author: Fiona Glaser Date: Tue Mar 31 10:36:57 2009 -0700 Activate intra_sad_x3_8x8c in lookahead commit c8fb152fd1debce5bf88173fb4b794c6b006099e [revision 1135] Author: Fiona Glaser Date: Tue Mar 31 10:34:35 2009 -0700 MBAFF interlaced coding is not allowed in baseline profile commit 55ccc4ef93952285ac0d609015751110b111e2a8 [revision 1134] Author: Fiona Glaser Date: Mon Mar 30 19:30:59 2009 -0700 intra_sad_x3_8x8 assembly commit 104511d6e13a2d6628ba321fe7f0cb25ac545b6f [revision 1133] Author: Fiona Glaser Date: Mon Mar 30 16:37:46 2009 -0700 intra_sad_x3_4x4 assembly commit 82aef940468385dbff6e32b77477a0c80124aca9 [revision 1132] Author: Fiona Glaser Date: Mon Mar 30 04:07:50 2009 -0700 intra_sad_x3_8x8c assembly Also fix intra_sad_x3_16x16's use of "n" as a loop variable (broke SWAP) commit 291b6ab1cb56c15a0169312b1c7ee8be7a1b594b [revision 1131] Author: Fiona Glaser Date: Sun Mar 29 18:27:32 2009 -0700 Shave one instruction off CABAC encode_decision range_lps>>6 ranges from 4-7, so (range_lps>>6)-4 == (range_lps>>6) & 3 commit a937afbe27515379f40085e6c663b6f6bc4c5191 [revision 1130] Author: Fiona Glaser Date: Thu Mar 26 22:22:23 2009 -0700 Faster probe_skip Add a second chroma threshold after the DC transform. commit 861d0b1c22de140724c91fe181208ec9debf848f [revision 1129] Author: Fiona Glaser Date: Thu Mar 19 12:28:21 2009 -0700 Add missing "static" qualifier to two arrays Should slightly improve performance. commit d25d50c9ffb02571c12e13c09356fa08fe97b0b4 [revision 1128] Author: Fiona Glaser Date: Tue Mar 17 11:01:57 2009 -0700 SSE2 zigzag_interleave Replace PHADD with FastShuffle (more accurate naming). This flag represents asm functions that rely on fast SSE2 shuffle units, and thus are only faster on Phenom, Nehalem, and Penryn CPUs. commit acd4b2641c662bbe29795e986c58a3c47de675e9 [revision 1127] Author: Fiona Glaser Date: Mon Mar 9 23:37:53 2009 -0700 Faster integral_init palignr to avoid unaligned loads is worth it in inith, but not initv. commit 1b627cce8226a45980ac0b8fa70aa3a85ad5617f [revision 1126] Author: Holger Lubitz Date: Mon Mar 9 14:05:16 2009 -0700 Faster SSSE3 hpel_filter_v ~10% faster hpel_filter on 64-bit Penryn. 32-bit version by Fiona Glaser. commit 4030a8bdc18e2380eabd921d7cf559b40f047013 [revision 1125] Author: Fiona Glaser Date: Sat Mar 7 16:43:09 2009 -0800 Faster SSE2 pixel_var Optimized using the DEINTB method from r1122. ~32% faster var_16x16 on Conroe. commit f701ebc84812eeab34735964a84f706ef2aa9625 [revision 1124] Author: Fiona Glaser Date: Sat Mar 7 00:27:27 2009 -0800 SSSE3 hpel_filter_v Optimized using the same method as in r1122. Patch partially by Holger. ~8% faster hpel filter on 64-bit Nehalem commit 936f76e00fb4eb35efeb1a505dd6b935d1cc3199 [revision 1123] Author: Fiona Glaser Date: Fri Mar 6 18:57:15 2009 -0800 Update some asm copyright headers commit 54e38917b413e80b474d3ed7ba344e7c489b020c [revision 1122] Author: Holger Lubitz Date: Fri Mar 6 18:16:30 2009 -0800 Vastly faster SATD/SA8D/Hadamard_AC/SSD/DCT/IDCT Heavily optimized for Core 2 and Nehalem, but performance should improve on all modern x86 CPUs. 16x16 SATD: +18% speed on K8(64bit), +22% on K10(32bit), +42% on Penryn(64bit), +44% on Nehalem(64bit), +50% on P4(32bit), +98% on Conroe(64bit) Similar performance boosts in SATD-like functions (SA8D, hadamard_ac) and somewhat less in DCT/IDCT/SSD. Overall performance boost is up to ~15% on 64-bit Conroe. commit 7501d9505a10d17d8cc238fd87af6330d2c1804c [revision 1121] Author: Fiona Glaser Date: Fri Mar 6 15:28:47 2009 -0800 Update x264 copyright date commit 79704fa50d50a6ae756643ad69f0170e5af831fd [revision 1120] Author: Fiona Glaser Date: Wed Mar 4 03:16:06 2009 -0800 Remove pre-scenecut from fprofile commands as well Also add psy-trellis to fprofile commit b77ea4db6faa06d9120defe6fa1a5f6803d224d4 [revision 1119] Author: Fiona Glaser Date: Tue Mar 3 16:21:52 2009 -0800 Slightly faster 8x16 SAD on Penryn Core 2 Same as MMX 8x16 cacheline SAD, but calls SSE2 8x16 SAD in non-cacheline case. Only Nehalem benefits from sizes smaller than 8x16, and Nehalem doesn't use cacheline functions, so no smaller versions are included. commit dfe8f732e6c0b4d97218b8417bda8034524eecb8 [revision 1118] Author: Fiona Glaser Date: Thu Feb 26 19:50:09 2009 -0800 Fix scenecut and VBV with videos of width/height <= 32 Also remove an unused variable commit 42f27d04b8fe0f9fb7e978edd38252d9d8a5af3d [revision 1117] Author: Fiona Glaser Date: Thu Feb 26 14:29:50 2009 -0800 Remove non-pre scenecut Add support for no-b-adapt + pre-scenecut (patch by BugMaster) Pre-scenecut was generally better than regular scenecut in terms of accuracy and regular scenecut didn't work in threaded mode anyways. Add no-scenecut option (scenecut=0 is now no scenecut; previously it was -1) Fix an incorrect bias towards P-frames near scenecuts with B-adapt 2. Simplify pre-scenecut code. commit 2d5dcf8c216cdf053ad55b29a60f941b055d2325 [revision 1116] Author: Guillaume Poirier Date: Tue Mar 3 07:44:18 2009 -0800 Add AltiVec version of hadamard_ac. 2.4x faster than the C version. Note this this implementation is pretty naive and should be improved by implementing what's discussed in this ML thread: date: Mon, Feb 2, 2009 at 6:58 PM subject: Re: [x264-devel] [PATCH] AltiVec implementation of hadamard_ac routines commit 2669f7ddf56b34240248ea02a8c7f8309e2b4610 [revision 1115] Author: Fiona Glaser Date: Thu Feb 26 12:07:56 2009 -0800 Fix regression in r1085 Deblocking was very slightly incorrect with partitions=all. Bug found by BugMaster. commit 56967517b7003192e9ac9e3110d566b2a05839f9 [revision 1114] Author: Fiona Glaser Date: Mon Feb 16 05:56:12 2009 -0800 Optimize neighbor CBP calculation and fix related regression r1105 introduced array overflow in cbp handling commit ce4de643fc96b89148a99281d3ef2edbf03e72f9 [revision 1113] Author: Tal Aloni Date: Fri Feb 13 16:30:14 2009 -0800 Show FPS when importing a raw YUV file commit c6e72b86ece1b49126f1d53ff67df7e0f6f85148 [revision 1112] Author: Anton Mitrofanov Date: Wed Feb 11 10:38:56 2009 -0800 Windows 64-bit support A "make distclean" is probably required after updating to this revision. commit ef48e51d8f1cdc6d9ef30d3e3a1455d91a13d0f0 [revision 1111] Author: Fiona Glaser Date: Wed Feb 11 10:35:56 2009 -0800 Minor fixes and cosmetics Suppress a GCC warning, fix a non-problematic array overflow, one REP->REP_RET. commit 65304078db6e69f7e47505c0518c6a913cf2bc9f [revision 1110] Author: Manuel Rommel Date: Tue Feb 10 12:06:47 2009 -0800 fix 10l in 75b495f2723fcb77f Original thread: date: Mon, Feb 9, 2009 at 9:37 PM subject: [x264-devel] commit: Spare a vec_perm and a vec_mergeh though using a LUT of permutation vectors . (Guillaume Poirier ) commit f34ce950a7d0eb89adb052f0f96e36a55c587dde [revision 1109] Author: Guillaume Poirier Date: Mon Feb 9 21:17:33 2009 +0100 Spare a vec_perm and a vec_mergeh though using a LUT of permutation vectors. commit 3c0cb9f0dd0730bebde169e29dd766ea56065c3a [revision 1108] Author: Guillaume Poirier Date: Mon Feb 9 21:12:23 2009 +0100 Promote chroma planes to 16 byte alignment. This will allow simplifying vectors loads that can only load 16-bytes aligned data (such as AltiVec). commit 0f386fe1fb626e654b7ca94fb27fbd727c0b4e97 [revision 1107] Author: Fiona Glaser Date: Mon Feb 9 11:30:54 2009 -0800 Fix 10L in intra pred Forgetting a %define resulted in SIGILL on 32-bit systems without SSE (e.g. Athlon XP). commit 6dc8b9ad888faf6da8d88d5a2a82f39636a94fef [revision 1106] Author: Fiona Glaser Date: Sun Feb 8 23:36:40 2009 -0800 Add decimation in i16x16 blocks Up to +0.04db with CAVLC, generally a lot less with CABAC. commit c656d68ff441d2925afcb40ccfaf49279fd95656 [revision 1105] Author: Fiona Glaser Date: Sat Feb 7 02:27:16 2009 -0800 Much faster CABAC residual context selection Up to ~17% faster CABAC RDO, ~36% faster intra-only CABAC RDO. Up to 7% faster overall in extreme cases. commit 5a7a1d14e461c431370d9111f3e6eb4efc15737f [revision 1104] Author: Fiona Glaser Date: Sat Feb 7 01:57:43 2009 -0800 Faster coeff_last64 on 32-bit commit 0743869d0743d47dc93c22b6d55ca84e1851ebc2 [revision 1103] Author: Fiona Glaser Date: Fri Feb 6 02:59:36 2009 -0800 More intra pred asm optimizations SSSE3 version of predict_8x8_hu SSE2 version of predict_8x8c_p SSSE3 versions of both planar prediction functions Optimizations to predict_16x16_p_sse2 Some unnecessary REP_RETs -> RETs. SSE2 version of predict_8x8_vr by Holger. SSE2 version of predict_8x8_hd. Don't compile MMX versions of some of the pred functions on x86_64. Remove now-useless x86_64 C versions of 4x4 pred functions. Rewrite some of the x86_64-only C functions in asm. commit 3c5cb4f10833c84fdce192c01c92b1a15145c85b [revision 1102] Author: Manuel Rommel Date: Sun Feb 8 21:35:51 2009 +0100 Speed-up mc_chroma_altivec by using vec_mladd cleverly, and unrolling. Also put width == 2 variant in its own scalar function because it's faster than a vectorized one. commit 5f5fa1e9dc6a7dd51fa6c2da243e27fae845887d [revision 1101] Author: Holger Lubitz Date: Wed Feb 4 12:46:17 2009 -0800 Merging Holger's GSOC branch part 2: intra prediction Assembly versions of most remaining 4x4 and 8x8 intra pred functions. Assembly version of predict_8x8_filter. A few other optimizations. Primarily Core 2-optimized. commit d35fe864df3c5dcff7a97cf4cc9ec8cd70f6ccb1 [revision 1100] Author: Guillaume Poirier Date: Wed Feb 4 10:04:55 2009 +0000 10l: fix compilation with GCC 4.3+ commit ded3e28cf1f593cbd1ad7c5255ba4ec82635574c [revision 1099] Author: Fiona Glaser Date: Sat Jan 31 05:00:39 2009 -0800 Faster 8x8dct+CAVLC interleave Integrate array_non_zero with the CAVLC 8x8dct interleave function. Roughly 1.5-2x faster than the original separate array_non_zero method. commit 741e1f99b988390a7016c9e6d5b9cac01a1eab87 [revision 1098] Author: Fiona Glaser Date: Sat Jan 31 01:00:26 2009 -0800 Measure CBP cost in i8x8 RD refinement ~0.02-0.05db PSNR gain at high quants in intra-only encoding, pretty small otherwise. Allows a small optimization in i8x8 encoding. commit 3c5f281ec05ef563e2371083105a10c2c2a84c2a [revision 1097] Author: Guillaume Poirier Date: Sun Feb 1 20:58:00 2009 +0100 Take advantage of saturated signed horizontal sum instructions in the variance computation epilogue since there won't be any overflow triggering an overflow. Suggested by Loren Merritt commit e394bd600ba9b1a3cee24e7d0b01dfb0acc5d1ad [revision 1096] Author: Fiona Glaser Date: Fri Jan 30 03:40:54 2009 -0800 Massive overhaul of nnz/cbp calculation Modify quantization to also calculate array_non_zero. PPC assembly changes by gpoirior. New quant asm includes some small tweaks to quant and SSE4 versions using ptest for the array_non_zero. Use this new feature of quant to merge nnz/cbp calculation directly with encoding and avoid many unnecessary calls to dequant/zigzag/decimate/etc. Also add new i16x16 DC-only iDCT with asm. Since intra encoding now directly calculates nnz, skip_intra now backs up nnz/cbp as well. Output should be equivalent except when using p4x4+RDO because of a subtlety involving old nnz values lying around. Performance increase in macroblock_encode: ~18% with dct-decimate, 30% without at CRF 25. Overall performance increase 0-6% depending on encoding settings. commit 9c55521590a2afe496394e51f6a42dc30939f8ad [revision 1095] Author: Guillaume Poirier Date: Thu Jan 29 01:28:12 2009 -0800 Add PowerPC support for "checkasm --bench", reading the time base register. This isn't ideal since the `time base' register is running at a fraction of the processor cycle speed, so the measurement isn't as precise as x86's rdtsc. It's better than nothing though... commit bf81694e7e38723513740d5d312a5866a5b59215 [revision 1094] Author: Brad Smith Date: Thu Jan 29 04:35:34 2009 +0000 fix detection of pthread and isfinite on OpenBSD commit 6ec3ec06cc7433a2cae2c84eb8eaa450506b5cdf [revision 1093] Author: Loren Merritt Date: Tue Jan 27 05:42:51 2009 +0000 remove $ECHON kludge, which broke on SunOS. bring back `gcc -MT`. remove auto-reconfigure on svn update, which has done nothing since we stopped using svn. fix $AS on sparc (was disabled by mmx check). fix --extra-asflags (was ignored). mark bash scripts as bash, not sh patch partly by Greg Robinson and Jugdish. commit 0e43d5d995bb436a63934d70792e481770f406d3 [revision 1092] Author: Loren Merritt Date: Mon Jan 26 14:28:48 2009 +0000 1.6x faster satd_c (and sa8d and hadamard_ac) with pseudo-simd. 60KB smaller binary. commit d4ca70f8398bdba2391fbcea4886ee0577494b08 [revision 1091] Author: Fiona Glaser Date: Tue Jan 27 23:27:56 2009 -0800 Hack around a potential failure point in VBV pred_b_from_p can become absurdly large in static scenes, leading to rare collapses of quality with VBV+B-frames+threads. This isn't a final fix, but should resolve the problem in most cases in the meantime. commit 83d805fe95b5dcf0493ecf0efa77ac6d0bc43a1d [revision 1090] Author: Fiona Glaser Date: Mon Jan 26 23:43:25 2009 -0800 Much faster chroma encoding and other opts ~15% faster chroma encode by reorganizing CBP calculation and adding special-case idct_dc function, since most coded chroma blocks are DC-only. Small optimization in cache_save (skip_bp) Fix array_non_zero to not violate strict aliasing (should eliminate miscompilation issues in the future) Add in automatic substitutions for some asm instructions that have an equivalent smaller representation. commit 360946d0f56a26b8b46c81088426557a628513cc [revision 1089] Author: Guillaume Poirier Date: Mon Jan 26 06:28:23 2009 -0800 add AltiVec implementation of x264_mc_copy_w16_aligned commit 521bbdd13131a8e95de99779182cfe7b5fa9ecd1 [revision 1088] Author: Guillaume Poirier Date: Fri Jan 23 13:53:06 2009 -0800 add AltiVec implementation of x264_pixel_var_16x16 and x264_pixel_var_8x8 commit 9e5d49f0b1599035b42d0fc0385e2d52e7f43be1 [revision 1087] Author: Guillaume Poirier Date: Fri Jan 23 01:11:20 2009 -0800 add AltiVec 16 <-> 32 bits conversions macros commit 7a1bfdd1f11a2da2ec6b9c473ed0baf9047a5460 [revision 1086] Author: Guillaume Poirier Date: Mon Jan 19 21:29:27 2009 +0100 Replace 16x16=>32 mul + pack + add by a simple 16x16=>16 multiply-add. Suggested by Loren. commit 1f0e78d8ea5b0d260f8497d4817b1962f6b0894d [revision 1085] Author: Fiona Glaser Date: Mon Jan 19 15:17:53 2009 -0800 Eliminate support for direct_8x8_inference=0 The benefit in the most extreme contrived situation was at most 0.001db PSNR, at the cost of slower decoding. As this option was basically useless, it was a waste of code and prevented some other useful optimizations. Remove some unused mc code related to sub-8x8 partitions. Small deblocking speedup when p4x4 is used. Also remove unused x264_nal_decode prototype from x264.h. commit 71e87faecf863ed7776e8d8c7eb339bdd7842877 [revision 1084] Author: Brad Smith Date: Mon Jan 19 05:14:53 2009 -0800 Add AltiVec and CPU numbers detection on OpenBSD. commit 7aa5a4e694f38aac7f217baa8998f197be77cbec [revision 1083] Author: Guillaume Poirier Date: Sun Jan 18 22:44:14 2009 +0100 Add AltiVec implementation of predict_8x8c_p. 2.6x faster than scalar C. commit 8e485c6d2bb463d44ac6047f018c272343d07e17 [revision 1082] Author: Fiona Glaser Date: Sat Jan 17 15:16:37 2009 -0500 Warn if direct auto wasn't set on the first pass And, if it wasn't, run direct auto as if it was the first pass, rather than simply forcing temporal direct mode on all frames. Also a small tweak to coeff_level_run asm. commit 0f822746e1427f8b237f4dc368d9ea2271b20644 [revision 1081] Author: Brad Smith Date: Sat Jan 17 12:52:28 2009 +0000 Changes the PowerPC ppccommon.h header so it no longer checks for a particular OS such as Linux but instead looks for HAVE_ALTIVEC_H being set. Fixes all *BSD/PowerPC builds. commit da9787a42606ddf7c211e9860bd6a585fbe8a803 [revision 1080] Author: Guillaume Poirier Date: Wed Jan 14 21:56:31 2009 +0100 update x264_hpel_filter_altivec's prototype to match the one of the C version. It changed in commit 045ae4045a1827555b3eaab4fbf3c9809e98c58f (factorization of mallocs) (NB: Altivec implementation wasn't allocating and writing to any scratch memory.) commit ed91c877df9ffcfceba8a387e7f5bf9302dc2276 [revision 1079] Author: Guillaume Poirier Date: Wed Jan 14 21:49:42 2009 +0100 rename vector+array unions to closer match the vector typedefs names. commit 7f0dc1a6048fabfed78ea8d29fde226453aad328 [revision 1078] Author: Guillaume Poirier Date: Wed Jan 14 21:13:58 2009 +0100 Add Altivec implementation of all the remaining 16x16 predict routines. commit 7ecbd9ea21867a269cb7a2afa0984d8c0bb6aa0e [revision 1077] Author: Fiona Glaser Date: Tue Jan 13 21:11:50 2009 -0500 Cache ref costs and use more accurate MV costs New MV costs should improve quality slightly by improving the smoothness of the field of MV costs (and they're closer to CABAC's actual costs). Despite being optimized for CABAC, they still help under CAVLC, albeit less. MV cost change by Loren Merritt commit 6b4b85f1e5d26a0314bbb621c3e72fb4bd43bfc6 [revision 1076] Author: Fiona Glaser Date: Tue Jan 13 20:22:36 2009 -0500 Support forced frametypes with scenecut/b-adapt This allows an input qpfile to be used to force I-frames, for example. The same can be done through the library interface. Document the format of the qpfile in --longhelp and the forcing of frametypes in x264.h Note that forcing B-frames and B-refs may not always have the intended result. Patch partially by Steven Walters . commit 3a2a2a4c29a5c835f97498885754f2be37617b22 [revision 1075] Author: Fiona Glaser Date: Tue Jan 13 19:58:44 2009 -0500 Remove an IDIV from i8x8 analysis Only one IDIV is left in macroblock level code (transform_rd) commit d7d1d37f7eb27940001ff436666da3744a1236be [revision 1074] Author: Fiona Glaser Date: Thu Jan 8 15:07:16 2009 -0500 Fix regression in r1066 With some combinations of video width and other settings, the scratch buffer was slightly too small. This caused heap corruption on some systems. Also prevent merange from being raised during encoding with esa/tesa through encoder_reconfig, as this no longer works. commit d52d44b319c30142903fceb09c52c9c8b64f22da [revision 1073] Author: Fiona Glaser Date: Tue Jan 6 16:55:44 2009 -0500 Disable B-frames in lossless mode They hurt compression anyways, and direct auto was bugged with lossless. commit cac42177137a54acfa6bf8b1368a9978fc9cf562 [revision 1072] Author: Brad Smith Date: Mon Jan 5 22:53:11 2009 +0000 Factorize in ppccommon.h the conditional inclusion of altivec.h on Linux systems. commit 438ca2d81828f3fcc72b25a792d64b3649627a42 [revision 1071] Author: Brad Smith Date: Mon Jan 5 15:58:32 2009 -0500 Disable __builtin_clz() intrinsic on gcc versions prior to 3.4. The function did not exist before that version. commit d1f4f0c7cd3502acdda273df55143de701cebc6a [revision 1070] Author: Fiona Glaser Date: Thu Jan 1 21:44:00 2009 -0500 Small tweaks to coeff asm Factor out a few redundant pxors Related cosmetics commit 6c15d57cfc40e0d1bd3529cdc82f2e8ac92734fb [revision 1069] Author: Steven Walters Date: Tue Dec 30 22:20:37 2008 -0500 Use the correct strtok under MSVC Also change one malloc -> x264_malloc commit 30b14a489b3983426dfb2beceb6f76cc485067c6 [revision 1068] Author: Fiona Glaser Date: Tue Dec 30 22:14:45 2008 -0500 Add stack alignment for lookahead functions Should allow libx264 to be called from non-gcc-compiled applications without adding force_align_arg_pointer. commit cb688111fb28225a4d1fe2a45472ac0cd093a08f [revision 1067] Author: Fiona Glaser Date: Tue Dec 30 20:47:45 2008 -0500 Add support for SSE4a (Phenom) LZCNT instruction Significantly speeds up coeff_last and coeff_level_run on Phenom CPUs for faster CAVLC and CABAC. Also a small tweak to coeff_level_run asm. commit 9e1f300078a010d07a7331c796d08d78c624c772 [revision 1066] Author: Steven Walters Date: Mon Dec 29 05:14:26 2008 +0000 factor mallocs out of hpel, ssim, and esa. there should now be no memory allocation outside of init-time. commit ffd73767089b2db2ca9a06891e10883fe2bcb3e2 [revision 1065] Author: Fiona Glaser Date: Tue Dec 30 03:12:17 2008 +0000 Much faster CAVLC RDO and bitstream writing Pure asm version of level/run coding. Over 2x faster than C. Up to 40% faster CAVLC RDO. Overall benefit up to ~7.5% with RDO or ~5% with fast encoding settings. commit f33ba9e2bc8d344c515f2d5f662958d323f3a074 [revision 1064] Author: Loren Merritt Date: Mon Dec 29 21:52:25 2008 -0500 Cosmetics: cleaner syntax for defining temporary registers in asm Globally define t#[qdwb], so that only t# needs to be locally defined when reorganizing registers commit 406a40dc41438edac3f60d231eb9196b3d33008f [revision 1063] Author: Fiona Glaser Date: Sat Dec 27 21:36:14 2008 -0500 Much faster CABAC RDO Since RDO doesn't care about what order bit costs are calculated, merge sigmap and level coding into the same loop in RDO. This is bit-exact for 4x4dct but slightly incorrect for 8x8dct due to the sigmap containing duplicated contexts. However, the PSNR penalty of this is extremely small (~0.001db). Speed benefit is about 15% in 4x4dct and 30% in 8x8dct residual bit cost calculation at QP20. Overall encoding speed benefit is up to 5%, depending on encoding settings. Also remove an old unnecessary CABAC table that hasn't been used for years. commit 131d066e4c79f4fe29ce9e70926ffd7faaf9b833 [revision 1062] Author: Fiona Glaser Date: Fri Dec 26 07:35:49 2008 -0500 VLC table optimizations Slightly reorganize VLC tables for ~2% faster block_residual_write_cavlc. Also a small optimization in p8x8 CAVLC. commit 0ad4944b15f0de78a686e49353d61205aac6526f [revision 1061] Author: Loren Merritt Date: Wed Dec 24 22:58:17 2008 -0500 Fix crash in --me esa/tesa introduced in r1058 Also suppress the last mingw warning message commit 9fe6e5e6fbeb045787bc47fd1fd073855510d427 [revision 1060] Author: Fiona Glaser Date: Tue Dec 23 22:33:28 2008 -0500 Optimize variance asm + minor changes Remove SAD argument from var, not needed anymore. Speed up var asm a bit by eliminating psadbw and instead HADDWing at end. Eliminate all remaining warnings on gcc 3.4 on cygwin Port another minor optimization from lavc (pskip) commit 8761805b8240c0da5f9d6d79b1a2affe3b5213ad [revision 1059] Author: Fiona Glaser Date: Tue Dec 23 18:31:48 2008 -0500 Minor CABAC cleanups and related optimizations Merge the two list tables to allow cleaner MC/CABAC/CAVLC code Remove lots of unnecessary {s Port some very minor opts from lavc commit bc29c635327d79f6a5372df30477db28635e3846 [revision 1058] Author: Loren Merritt Date: Thu Dec 11 19:47:17 2008 +0000 faster ESA init reduce memory if using ESA and not p4x4 commit 8e5d63a544efb6eb0f6677f718033f049c1ccd56 [revision 1057] Author: Fiona Glaser Date: Mon Dec 15 23:02:49 2008 -0800 More macroblock_cache optimizations Patch partially by Loren Merritt commit f9307df88e39cafa30db47249895fdd1745cc1aa [revision 1056] Author: Fiona Glaser Date: Mon Dec 15 13:15:29 2008 -0800 Faster macroblock_cache_rect Explicit loop unrolling commit 69dc9f4dbe3283c4069bd4d1ddd4685510714375 [revision 1055] Author: Fiona Glaser Date: Sun Dec 14 18:30:51 2008 -0800 Optimizations in predict_mv_direct Add some early terminations and minor optimizations This change may also fix the extremely rare direct+threading MV bug. commit 9b8370b395349918b8e4171b2bcbeacf83d67231 [revision 1054] Author: David Wolstencroft Date: Sun Dec 14 10:47:28 2008 +0000 Fix visual corruption when picture width was not mod 32. The previous Altivec implemention of mc_chroma assumed that i_src_stride was always mod 16. commit 664a4e41959dcecf5030196a91a48aac667b8c35 [revision 1053] Author: Guillaume Poirier Date: Mon Dec 8 21:11:45 2008 +0100 Add support for FSF GCC version >= 4.3 on OSX. So far, only Apple GCC version was supported. commit fa6728b60d2ab3bef9e9ab29635672e0c6697c3d [revision 1052] Author: Fiona Glaser Date: Thu Dec 11 17:31:52 2008 -0800 More accurate refcost for p8x8 CAVLC Slightly better quality, especially in non-RD mode, with CAVLC. commit 6abf5d67010f8c3889f3184769e09f12fbe473c2 [revision 1051] Author: Loren Merritt Date: Wed Dec 10 20:54:17 2008 -0800 use lookup tables instead of actual exp/pow for AQ Significant speed boost, especially on CPUs with atrociously slow floating point units (e.g. Pentium 4 saves 800 clocks per MB with this change). Add x264_clz function as part of the LUT system: this may be useful later. Note this changes output somewhat as the numbers from the lookup table are not exact. commit b219d4fcfbb0ac904ff0e7e4ed67c3511e6596a8 [revision 1050] Author: Fiona Glaser Date: Wed Dec 10 20:53:13 2008 -0800 Suppress saveptr warnings on Windows GCC commit e0779152f7c9489ba89481272537e9ac0a1f733a [revision 1049] Author: Fiona Glaser Date: Wed Dec 10 20:52:06 2008 -0800 More small speed tweaks to macroblock.c commit 99448f6c98289e74f1234e38b9ed2c945f2bdfca [revision 1048] Author: Fiona Glaser Date: Mon Dec 8 13:44:23 2008 -0800 Much faster CAVLC residual coding Use a VLC table for common levelcodes instead of constructing them on-the-spot Branchless version of i_trailing calculation (2x faster on Nehalem) Completely remove array_non_zero_count and instead use the count calculated in level/run coding. Note: this slightly changes output with subme > 7 due to different nonzero counts being stored during qpel RD. commit 89a893a0b153cbc9fa5143aad15b17581b9b448b [revision 1047] Author: Guillaume Poirier Date: Fri Dec 5 22:26:55 2008 +0100 fix compilation with GCC-4.3+ commit fa800b23cc0e1eb5ca603e845d977952ee63ddd6 [revision 1046] Author: Fiona Glaser Date: Sat Nov 29 23:13:58 2008 -0800 High Profile allows 25% higher maxbitrate/cpb Correct level detection to take this into account. commit bf65c5583d7582d2f0446a2848b37f4663369135 [revision 1045] Author: BugMaster Date: Sat Nov 29 14:04:29 2008 -0800 s/nasm/yasm in VS project file commit 19ebada1c9a67ac0837936eb9269224cb3ce8dd7 [revision 1044] Author: Fiona Glaser Date: Sat Nov 29 04:49:18 2008 -0800 Cosmetic: update various file headers. commit 85c217958c677b164305582e3c6304bf42f1bac5 [revision 1043] Author: Loren Merritt Date: Sat Nov 29 11:54:02 2008 +0000 add date and compiler to `x264 --version` commit df72b08c60856a71f4a15634a6a87e0fe34bca15 [revision 1042] Author: Fiona Glaser Date: Fri Nov 28 14:32:11 2008 -0800 10L in r1041 commit c1d73389eaaebb29ca69f7436ea5a8a707a555c9 [revision 1041] Author: Fiona Glaser Date: Thu Nov 27 19:37:56 2008 -0800 Significantly faster CABAC and CAVLC residual coding and bit cost calculation Early-terminate in residual writing using stored nnz counts To allow the above, store nnz counts for luma and chroma DC Add assembly functions to find the last nonzero coefficient in a block Overall ~1.9% faster at subme9+8x8dct+qp25 with CAVLC, ~0.7% faster with CABAC Note this changes output slightly with CABAC RDO because it requires always storing correct nnz values during RDO, which wasn't done before in cases it wasn't useful. CAVLC output should be equivalent. commit ecb04a3ba99324dd6a319224a0dae4e3fa962b40 [revision 1040] Author: Fiona Glaser Date: Wed Nov 26 23:42:55 2008 -0800 dequant_4x4_dc assembly About 3.5x faster DC dequant on Conroe commit 6ce71ce7b935bdd7efb0c843dafd0c208194ab65 [revision 1039] Author: Loren Merritt Date: Thu Nov 27 02:37:46 2008 +0000 fix an overflow in dct4x4dc_mmx (unlikely to have occurred in any real video) commit c5c0a7fd77b039fcec891ab97e34d9d40fec6839 [revision 1038] Author: Fiona Glaser Date: Tue Nov 25 16:30:39 2008 -0800 Remove nasm support Nasm won't correctly parse the SSE4 code introduced a few revisions ago, so we're removing support. Users should upgrade to yasm 0.6.1 or later. commit 0e58d0373bb8586f78eb1b95221b347123689e3c [revision 1037] Author: BugMaster Date: Tue Nov 25 15:11:24 2008 -0800 Fix rare warning messages in ratecontrol due to r1020 commit 632e09999c9cf5828d80aa98ec357181607d4447 [revision 1036] Author: BugMaster Date: Tue Nov 25 15:10:43 2008 -0800 Fix MSVC compilation and clean up MSVC build file Remove Release64 which never worked anyways. commit 69e69197c424bff9e4b90eb5d608f15b59ca77b4 [revision 1035] Author: Fiona Glaser Date: Tue Nov 25 01:04:26 2008 -0800 Faster width4 SSD+SATD, SSE4 optimizations Do satd 4x8 by transposing the two blocks' positions and running satd 8x4. Use pinsrd (SSE4) for faster width4 SSD Globally replace movlhps with punpcklqdq (it seems to be faster on Conroe) Move mask_misalign declaration to cpu.h to avoid warning in encoder.c. These optimizations help on Nehalem, Phenom, and Penryn CPUs. commit e76caf368c7044fdd1eff6a423d9518e9818a4ba [revision 1034] Author: Guillaume Poirier Date: Tue Nov 25 17:27:27 2008 +0100 fix indentation, whitespace cleanup, more consistent indentation of macro backslashes commit 49b16f1d1ca3ebe90c43ec950ca279b842892061 [revision 1033] Author: David Wolstencroft Date: Sat Nov 22 17:54:38 2008 +0100 Change some macros to be more sensitive to memory alignment, thus avoiding useless loads/stores and calculations of permutation vectors. Affected functions are all of mc_luma, mc_chroma, 'get_ref', SATD, SA8D and deblock. Gains globally vary from ~5% - 15% on a depending on settings running on a 1.42 ghz G4. commit e56a842d2e5ae5e4cdc412adb73e7c952e0f29cb [revision 1032] Author: Loren Merritt Date: Fri Nov 7 05:31:24 2008 +0000 refactor satd. 20KB smaller binary. refactor sa8d. slightly faster. more checkasm for hadamard. commit d2c6e84dcafff29944ddb43ac58dbb7c23b33605 [revision 1031] Author: Fiona Glaser Date: Mon Nov 24 21:56:24 2008 -0800 Fix crash with threads and SSEMisalign on Phenom Misalign mask needed to be set separately for each encoding thread. commit 80ea99c001eaab58a0ff54f0b2c4815cb2e63076 [revision 1030] Author: Fiona Glaser Date: Fri Nov 21 03:39:11 2008 -0800 Phenom CPU optimizations Faster hpel_filter by using unaligned loads instead of emulated PALIGNR Faster hpel_filter on 64-bit by using the 32-bit version (the cost of emulated PALIGNR is high enough that the savings from caching intermediate values is not worth it). Add support for misaligned_mask on Phenom: ~2% faster hpel_filter, ~4% faster width16 multisad, 7% faster width20 get_ref. Replace width12 mmx with width16 sse on Phenom and Nehalem: 32% faster width12 get_ref on Phenom. Merge cpu-32.asm and cpu-64.asm Thanks to Easy123 for contributing a Phenom box for a weekend so I could write these optimizations. commit 7df060bedbc72232fdf48869cea47bcd480e8eda [revision 1029] Author: Fiona Glaser Date: Thu Nov 20 20:11:14 2008 -0800 A few tweaks to decimate asm A little bit faster on both 32-bit and 64-bit commit a99183d3685c26d6d7815d6be3fe28fcd77c94bf [revision 1028] Author: Fiona Glaser Date: Wed Nov 12 16:50:31 2008 -0800 Nehalem optimization part 2: SSE2 width-8 SAD Helps a bit on Phenom as well ~25% faster width8 multiSAD on Nehalem commit 4975e8187193c5f0bcc6b91b88c43e50482b8a1e [revision 1027] Author: Fiona Glaser Date: Mon Nov 10 23:34:02 2008 -0800 Add subme=0 (fullpel motion estimation only) Only for experimental purposes and ultra-fast encoding. Probably not a good idea for firstpass. commit ebe1103b4c8e2e7f61e294b65c70756c645fee49 [revision 1026] Author: Fiona Glaser Date: Mon Nov 10 15:34:48 2008 -0800 Fix minor memory leak in r1022 commit ac675e30c3c47a409d015b2d4f6d6f495f53e417 [revision 1025] Author: Fiona Glaser Date: Mon Nov 10 15:32:06 2008 -0800 r1024 borked checkasm Remove idct/dct2x2 from checkasm as they are no longer in dctf commit be1211807c0725803253d44f47cd6305ffbaddf9 [revision 1024] Author: Fiona Glaser Date: Sun Nov 9 17:39:21 2008 -0800 Faster chroma encoding 9-12% faster chroma encode. Move all functions for handling chroma DC that don't have assembly versions to macroblock.c and inline them, along with a few other tweaks. commit ae51235dd5ad1f9b6396a857f478b4f391cffcff [revision 1023] Author: Fiona Glaser Date: Sun Nov 9 17:34:31 2008 -0800 Various cosmetics and minor fixes Disable hadamard_ac sse2/ssse3 under stack_mod4 Fix one MSVC compilation warning Fix compilation in debug mode in certain cases on x64 Remove eval.c from MSVC project Fix crash when VBV is used in CQP mode Patches by MasterNobody commit 0c841de6810678f3da1c06a34595cb490d59eeb6 [revision 1022] Author: Fiona Glaser Date: Sat Nov 8 20:16:17 2008 -0800 Faster b-adapt + adaptive quantization Factor out pow to be only called once per macroblock. Speeds up b-adapt, especially b-adapt 2, considerably. Speed boost is as high as 24% with b-adapt 2 + b-frames 16. commit f2a12915c1df7df87a816d07d724e4f1f7b00729 [revision 1021] Author: Fiona Glaser Date: Fri Nov 7 11:39:43 2008 -0800 Faster CABAC residual encoding 6% faster block_residual_write_cabac in RD mode. commit a7831e46278fed3f7f907b7b1687b1f877e9fb1e [revision 1020] Author: Fiona Glaser Date: Wed Nov 5 19:51:59 2008 -0800 Fix potential crash in the case that the input statsfile is too short Also resolve various other potential weirdness (such as multiple copies of the same error message in threaded mode). commit 1bf7228f7e975e9220daae5a439797aaea2aa511 [revision 1019] Author: Fiona Glaser Date: Wed Nov 5 03:11:45 2008 -0800 Initial Nehalem CPU optimizations movaps/movups are no longer equivalent to their integer equivalents on the Nehalem, so that substitution is removed. Nehalem has a much lower cacheline split penalty than previous Intel CPUs, so cacheline workarounds are no longer necessary. Thanks to Intel for providing Avail Media with the pre-release Nehalem CPU needed to prepare these (and other not-yet-committed) optimizations. Overall speed improvement with Nehalem vs Penryn at the same clock speed is around 40%. commit fc321fd6ae4425eb2fba677eba5dc5ce36e98dd4 [revision 1018] Author: Gabriel Bouvigne Date: Tue Nov 4 09:56:03 2008 -0800 Fix potential infinite loop in VBV under GCC 4.2 commit 16e3ef85b4d1162ac46e4b6b384bc61481dbaf7a [revision 1017] Author: Fiona Glaser Date: Mon Nov 3 22:59:49 2008 -0800 Encoder_reconfig: esa/tesa can only be enabled if they were on to begin with Bug report by kemuri-_9. commit ca49901f75ba26ec4e1c7e0c448bfc759e78f961 [revision 1016] Author: Loren Merritt Date: Thu Oct 30 00:47:09 2008 -0700 Fix bug in hadamard_ac SSE assembly Some extreme inputs could cause overflows. commit fb1af79ef8b0c18a317a0582077f12ac63d6c9f0 [revision 1015] Author: Fiona Glaser Date: Tue Oct 28 20:35:15 2008 -0700 Full sub8x8 RD mode decision Small speed penalty with p4x4 enabled, but significant quality gain at subme >= 6 As before, gain is proportional to the amount of p4x4 actually useful in a given input at the given bitrate. commit e09f55ccc3b9ffee42d6ed6a86cbe88ef603b05b [revision 1014] Author: Fiona Glaser Date: Sat Oct 25 01:50:08 2008 -0700 Optimize CABAC bit cost calculation Speed up cabac mvd and add new precalculated transition/entropy table. Add "noup" function for cabac operations to not update the state table when it isn't necessary. 1-3% faster macroblock_size_cabac. Cosmetics commit b875aa64ce1e416347af55cac7326ab72456eb68 [revision 1013] Author: Anders Ossowicki Date: Thu Oct 23 22:36:11 2008 -0700 Replace "git-command" with "git command" in version.sh for git 1.6 support commit e9a6bd75f7203790a256d2cfb8838f2c06404410 [revision 1012] Author: Loren Merritt Date: Thu Oct 23 13:45:04 2008 -0700 Add assembly version of CAVLC 8x8dct interleave Faster CAVLC encoding and RDO with 8x8dct commit f151cc4b9bd513f06511519ddb89b1ee80a722eb [revision 1011] Author: Alexander Strange Date: Wed Oct 22 15:55:30 2008 -0700 Add support for psy-rd/trellis to encoder_reconfig commit f7cc3064f5edd0e75afed26904b57e6704dcaabf [revision 1010] Author: Alexander Strange Date: Wed Oct 22 15:00:43 2008 -0700 Fix Darwin speed regression commit 5254d366adc8b0a926df197cd96a689c8583a370 [revision 1009] Author: Gabriel Bouvigne Date: Wed Oct 22 14:48:47 2008 -0700 Further improve prediction of bitrate and VBV in threaded mode commit 5993b7e968fc1154a4fae417fb9dbb0157c60cf8 [revision 1008] Author: Fiona Glaser Date: Wed Oct 22 13:37:09 2008 -0700 Sub-8x8 Qpel-RD in P-frames Improves quality when using p8x4/p4x8/p4x4 subpartitions Benefit is proportional to how many sub-8x8 partitions are used; helps most at high bitrates and low resolutions. commit fe5f0a473508ef3154c7f0809b8500c3ddd5eee2 [revision 1007] Author: Fiona Glaser Date: Wed Oct 22 02:20:06 2008 -0700 Faster qpel-RD 3-4% faster qpel-RD; avoid re-checking bmv/pmv during the hex search. commit d17e81e26ae3fc3093182c00af254ceac9bed21f [revision 1006] Author: Fiona Glaser Date: Wed Oct 22 00:37:00 2008 -0700 Some minor optimizations in RD refinement Don't write b subpartition in CABAC RDO Calculate nonzero count in i4x4 CAVLC RDO commit 91522693403f9bc06985f8e4e9aebb6d4b43fc5a [revision 1005] Author: Fiona Glaser Date: Tue Oct 21 20:17:18 2008 -0700 Faster deblocking when p4x4 isn't used Most of the MV checks can be skipped, resulting in faster strength calculation commit 09c6a0b2812d2f60c69658b62c22ac8a71ba39a9 [revision 1004] Author: Fiona Glaser Date: Tue Oct 21 19:38:21 2008 -0700 Print profile and level information upon starting encode Previously level was only printed as part of autodetect, and only in verbose mode. commit 21afe78c85469ae11e7ad638c0a9f958c6573933 [revision 1003] Author: Fiona Glaser Date: Tue Oct 21 17:10:46 2008 -0700 Fix possible crash in trellis at very low QPs commit d1fbc652674e9aed6c83ecb321b09891ea5c7e05 [revision 1002] Author: Fiona Glaser Date: Tue Oct 21 14:59:07 2008 -0700 Add assembly versions of decimate_score 3-7x faster decimation, 1-3% faster overall commit 8d6b262d3c806ae4e8380a6b0c6d31c6c105dba7 [revision 1001] Author: Fiona Glaser Date: Sat Oct 18 03:40:59 2008 -0700 Fix typo in subme8/9 lossless qpel-RD Slightly improves compression. commit a516e8e497f098985152aa047de95ffd20b578bb [revision 1000] Author: Fiona Glaser Date: Thu Oct 16 03:17:53 2008 -0700 Extend trellis to support luma/chroma DC and chroma AC Small speed loss in trellis 1, slightly larger in trellis 2, but significant quality improvement. commit e21bc3443d5717d0960130486f6b8b712d2be8df [revision 999] Author: Loren Merritt Date: Thu Oct 2 20:57:08 2008 -0600 rm gtk, avc2avi. I don't remember why I allowed a gui into the repository in the first place. There's nothing that makes this one special relative to all the other x264 guis. avc2avi doesn't compile since we removed the bitstream reader. And avc doesn't belong in avi. commit be4be30ff33ccf0cbe7ed5f275e89c87b5927c86 [revision 998] Author: Fiona Glaser Date: Thu Oct 2 18:11:13 2008 -0700 Resolve quality regression in r996 Accidentally removed the wrong line of code. I think this classifies as a "10l". Thanks to techouse for initial bug report and skystrife for helping me find it. commit 9df640c440e2a5b51683de01ba5a76e72ecc44f3 [revision 997] Author: Ralf Terdic Date: Thu Oct 2 08:52:33 2008 -0700 Fix minor memory leak accidentally added with the addition of b-adapt 2 commit 60455fff82906da0237a4f56b3686a588579e41f [revision 996] Author: Fiona Glaser Date: Tue Sep 30 18:34:56 2008 -0700 Rework subme system, add RD refinement in B-frames The new system is as follows: subme6 is RD in I/P frames, subme7 is RD in all frames, subme8 is RD refinement in I/P frames, and subme9 is RD refinement in all frames. subme6 == old subme6, subme7 == old subme6+brdo, subme8 == old subme7+brdo, subme9 == no equivalent --b-rdo has, accordingly, been removed. --bime has also been removed, and instead enabled automatically at subme >= 5. RD refinement in B-frames (subme9) includes both qpel-RD and an RD version of bime. commit 9b10152ffdc006d98a1ddea8ee19d1fdc70a0141 [revision 995] Author: Fiona Glaser Date: Mon Sep 29 00:11:38 2008 -0700 Fix potential miscompilation of some inline asm Caused problems under some gcc 4.x versions with predictive lossless commit a9e86d248d8d5f1e892159a7d86dcea2f884a859 [revision 994] Author: Fiona Glaser Date: Sat Sep 27 16:37:27 2008 -0700 Replace High 4:4:4 profile lossless with High 4:4:4 Predictive. This improves lossless compression by about 4-25% depending on source. The benefit is generally higher for intra-only compression. Also add support for 8x8dct and i8x8 blocks in lossless mode; this improves compression very slightly. In some rare cases 8x8dct can hurt compression in lossless mode, but its usually helpful, albeit marginally. Note that 8x8dct is only available with CABAC as it is never useful with CAVLC. High 4:4:4 Predictive replaced the previous profile in a 2007 revision to the H.264 standard. The only known compliant decoder for this profile is the latest version of CoreAVC. As I write this, JM does not actually correctly decode this profile. Hopefully this lack of support will soon change with this commit, as x264 will be (to my knowledge) the first compliant encoder. commit adccf49a631a9e424dc0e86476752d511065582d [revision 993] Author: Fiona Glaser Date: Fri Sep 26 09:19:56 2008 -0700 Fix typo in progress indicator when using piped input commit cb173c5044fcc4792b7978720884cea7aa2e3848 [revision 992] Author: Loren Merritt Date: Mon Sep 22 04:17:35 2008 -0600 avg_weight_ssse3 commit 3e5b130aba1ae8e1cc49b9a7ddf138abe6d78934 [revision 991] Author: Loren Merritt Date: Sat Sep 20 08:41:17 2008 -0600 fix bitstream writer on bigendian 64bit (regression in r903) commit 8292f8945ade54d7ac1e171c94bfe67409c41b20 [revision 990] Author: Loren Merritt Date: Fri Sep 19 23:52:11 2008 -0600 remove authors whose code no longer exists commit a70f8802d945d4fdd061a661b9af0a432362903e [revision 989] Author: Loren Merritt Date: Mon Sep 15 05:00:26 2008 -0600 more diagnostics when configure finds an unsuitable assembler commit 2103b3579b617a7d51d9b5e61953bc5531232948 [revision 988] Author: Fiona Glaser Date: Fri Sep 26 09:19:56 2008 -0700 Make x264 progress indicator more concise Now the % indicator should be readable on the header of a minimized window on Windows systems. commit cd5919121f613431f483b52452f92e8195217974 [revision 987] Author: Fiona Glaser Date: Sun Sep 21 22:17:34 2008 -0700 Fix deblocking + threads + AQ bug At low QPs, with threads and deblocking on, deblocking could be improperly disabled. Revision in which this bug was introduced is unknown; it may be as old as b_variable_qp in x264 itself. commit c7d9960a8d91e1fdd207fc7ad7c6f130f573e53f [revision 986] Author: Fiona Glaser Date: Sun Sep 21 13:35:00 2008 -0700 Resolve possible crash in bime, improve the fix in r985 commit fab9d57a8c1327b97c01b83f5e9d58315622250c [revision 985] Author: Fiona Glaser Date: Sat Sep 20 19:36:07 2008 -0700 Fix rare crash issue in b-adapt Regression *probably* in r979 commit 78798908acafad0ec536bcf6a81a95f50f5461a4 [revision 984] Author: Holger Lubitz Date: Sat Sep 20 02:36:55 2008 -0700 Merging Holger's GSOC branch part 1: hpel_filter speedups commit 20c01eefb707ad5c7291cd882d118ba8f6cf9d9a [revision 983] Author: Loren Merritt Date: Sat Sep 20 12:31:10 2008 -0600 r980 borked weighted bime commit 57c472ce6a12188041a04213543f2394d74962af [revision 982] Author: Fiona Glaser Date: Sat Sep 20 01:39:16 2008 -0700 Disable I_PCM with psy-RD psy-RD seems to put the PCM threshold a bit lower than it should be, so PCM is now disabled under psy-RD. commit 42d57caaaf2de55d131e677a7fa8231148432435 [revision 981] Author: Fiona Glaser Date: Fri Sep 19 09:21:34 2008 -0700 Merge avg and avg_weight avg_weight no longer has to be special-cased in the code; faster weightb commit b7d27eaab35a6fdffc66ffff51bd287b0f67bb3e [revision 980] Author: Fiona Glaser Date: Wed Sep 17 21:25:05 2008 -0700 Rewrite avg/avg_weight to take two source pointers This allows the use of get_ref instead of mc_luma almost everywhere for bipred commit c4f3dabecde673fabcefa832fff490af9d738641 [revision 979] Author: Fiona Glaser Date: Wed Sep 17 00:33:37 2008 -0700 Use low-resolution lookahead motion vectors as an extra predictor Improves quality considerably (0-5%) in 1pass/CRF mode, especially with lower --me values and complex motion. Reverses the order of lowres lookahead search to improve the usefulness of the extra predictors. commit f8f5313909c20420a6e6efd69cbe8ec5147a12ac [revision 978] Author: Fiona Glaser Date: Tue Sep 16 22:44:10 2008 -0700 Add missing free() for f_qp_offset in frame.c commit d8163ffd10fb290520d096d8b05cae2f727ac9bf [revision 977] Author: Gabriel Bouvigne Date: Tue Sep 16 01:54:37 2008 -0700 Correct misprediction of bitrate in threaded mode Improves bitrate accuracy in cases with large numbers of threads. Loosely based on a patch by BugMaster. commit 08e737d12b3eaf4c2d6c1b8bbcd18628684221eb [revision 976] Author: Gabriel Bouvigne Date: Tue Sep 16 01:53:02 2008 -0700 Fix a case in which VBV underflows can occur Fix a potential case where a frame might be initially allocated too low a QP, which would then have to be raised a low during row-based ratecontrol. In some cases, this could even produce VBV underflows in 2pass mode. commit bdb435f73dda80e54ae6b4f5c861bd62ed99ed3d [revision 975] Author: Panagiotis Issaris Date: Mon Sep 15 20:47:50 2008 +0200 Use correct format specifier for uint64_t commit c299b7d87e9ee6e7fca6b7d234847cc20eacc688 [revision 974] Author: Fiona Glaser Date: Tue Sep 16 00:31:26 2008 -0700 Cache motion vectors in lowres lookahead This vastly speeds up b-adapt 2, especially at large bframes values. This changes output because now MV prediction in lookahead only uses L0/L1 MVs, not bidir. This isn't a problem, since the bidir prediction wasn't really correct to begin with, so the change in output is neither positive nor negative. This also allowed the removal of some unnecessary memsets, which should also give a small speed boost. Finally, this allows the use of the lowres motion vectors for predictors in some future patch. commit 44d3d5ba678ba24d90720d0883dddcda8832de03 [revision 973] Author: Fiona Glaser Date: Mon Sep 15 12:22:48 2008 -0700 Fix regression in b-adapt patch: encoder_open failed for multipass encodes without bframes. commit 58a770fe6a6f6777851a6fbb1c14043b9c0eff2a [revision 972] Author: Fiona Glaser Date: Mon Sep 15 10:53:29 2008 -0700 Stop SAR in y4m input from overriding --sar on commandline commit a8cb7662d9ffdab42c83074aee3835b3b0104c73 [revision 971] Author: Loren Merritt Date: Mon Sep 15 02:24:12 2008 -0600 hadamard_ac for psy-rd c version is 1.7x faster than satd+sa8d+sad ssse3 version is 2.3x faster than satd+sa8d+sad commit ecc9bfab548f464d4c2be899055f7ba567c1ed8e [revision 970] Author: Fiona Glaser Date: Sun Sep 14 21:36:45 2008 -0700 Psychovisually optimized rate-distortion optimization and trellis The latter, psy-trellis, is disabled by default and is reserved as experimental; your mileage may vary. Default subme is raised to 6 so that psy RD is on by default. commit 95ed2720b7772199f04cc9a657632107bb1c548c [revision 969] Author: Fiona Glaser Date: Sun Sep 14 18:18:15 2008 -0700 Add optional more optimal B-frame decision method This method (--b-adapt 2) uses a Viterbi algorithm somewhat similar to that used in trellis quantization. Note that it is not fully optimized and is very slow with large --bframes values. It also takes into account weightb, which should improve fade detection. Additionally, changes were made to cache lowres intra results for each frame to avoid recalculating them. This should improve performance in both B-frame decision methods. This can also be done for motion vectors, which will dramatically improve b-adapt 2 performance when it is complete. This patch also reads b_adapt and scenecut settings from the first pass so that the x264 header information in the output file will have correct information (since frametype decision is only done on the first pass). commit 80458ffcd62f0852e7092176b7b155bdfd3d5a82 [revision 968] Author: Fiona Glaser Date: Sat Sep 13 14:03:12 2008 -0700 Move adaptive quantization to before ratecontrol, eliminate qcomp bias This change improves VBV accuracy and improves bit distribution in CRF and 2pass. Instead of being applied after ratecontrol, AQ becomes part of the complexity measure that ratecontrol uses. This allows for modularity for changes to AQ; a new AQ algorithm can be introduced simply by introducing a new aq_mode and a corresponding if in adaptive_quant_frame. This also allows quantizer field smoothing, since quantizers are calculated beofrehand rather during encoding. Since there is no more reason for it, aq_mode 1 is removed. The new mode 1 is in a sense a merger of the old modes 1 and 2. WARNING: This change redefines CRF when using AQ, so output bitrate for a given CRF may be significantly different from before this change! commit f89e0d06700620d4e2f1467e80995f8192182496 [revision 967] Author: Fiona Glaser Date: Tue Sep 9 23:51:17 2008 -0700 Fix crash when using b-adapt at resolutions 32x32 or below. Original patch by BugMaster, but was mostly rewritten in order to make b-adapt actually *work* at such resolutions, not merely stop crashing. commit d24f8a9153e67f2f9529283aeb523806fab17fe1 [revision 966] Author: Fiona Glaser Date: Tue Sep 9 23:12:20 2008 -0700 Add title-bar progress indicator under WIN32 Also add bitrate-so-far output when piping data to x264 (total frames not known) Patch mostly by recover from Doom9. commit 654e549862a0bf56671de13d38ad5c512d2a9efe [revision 965] Author: Fiona Glaser Date: Fri Sep 5 23:14:23 2008 -0700 Revert part of r963 In some rare (but significant) cases, the optimized nal_encode algorithm gave incorrect results. commit cc0c3d4d1e639512e2b9003a68597fdb6ce00d4f [revision 964] Author: Fiona Glaser Date: Thu Sep 4 20:13:38 2008 -0700 Predict 4x4_DC asm Also remove 5-year-old unnecessary #define that reduced speed unnecessarily under MSVC-compiled builds commit 5993fccac7646d84710b4ffc6feb3f3b4fd736d8 [revision 963] Author: Fiona Glaser Date: Thu Sep 4 00:43:54 2008 -0700 Faster NAL unit encoding and remove unused nal_decode Small speedup at very high bitrates commit 5d0904bfda094b6243d9d8596c50edd4f0fe5528 [revision 962] Author: Fiona Glaser Date: Wed Sep 3 22:12:23 2008 -0700 CAVLC cleanup and optimizations Also move some small functions in macroblock.c to a .h file so they can be inlined. commit 277d2da8958b5e08d119c0068a54842bc5c3af71 [revision 961] Author: Fiona Glaser Date: Wed Sep 3 21:43:06 2008 -0700 Faster avg_weight assembly Unrolling the loop a bit improves performance commit 1af195341d0f210382827b43051c79e33d900989 [revision 960] Author: Fiona Glaser Date: Wed Sep 3 15:35:22 2008 -0700 Faster H asm intra prediction functions Take advantage of the H prediction method invented for merged intra SAD and apply it to regular prediction, too. commit 4d84a45d7e505e4929a0110e047aa29a752e3253 [revision 959] Author: Fiona Glaser Date: Wed Sep 3 15:32:16 2008 -0700 Add merged SAD for i16x16 analysis Roughly 30% faster i16x16 analysis under subme=1 commit 2bff50702978bf2af30ef2b58264bd71549bc702 [revision 958] Author: Fiona Glaser Date: Wed Sep 3 15:15:17 2008 -0700 Add sad_aligned for faster subme=1 mbcmp Distinguish between unaligned and aligned uses of mbcmp SAD_aligned, for MMX SADs, uses non-cacheline SADs. commit fc36067b632e611f7b0e056381dd641d469376e6 [revision 957] Author: Fiona Glaser Date: Tue Sep 2 11:49:55 2008 -0700 Improve progress indicator Show average bitrate so far during encoding Decrease update interval for longer encodes (max of 10 frames encoded between updates) commit ce21e79df8197abaa35d7a6838a1aedeb4411578 [revision 956] Author: Fiona Glaser Date: Mon Sep 1 10:35:41 2008 -0700 Fix speed regression in r951 Row SATDs are only necessary in VBV mode, so don't need to be checked if VBV is off. commit 8957bad80bb17eb23a39b86e903f8058d79d7364 [revision 955] Author: Holger Lubitz Date: Sun Aug 31 20:55:50 2008 -0600 zigzag asm commit 44d9c160bf45b827be9f99a91d8f53062246873d [revision 954] Author: Guillaume Poirier Date: Sun Aug 31 21:46:31 2008 +0200 fix SOFLAGS used when building gtk frontend patch by Markus Kanet %darkvision A gmx P eu% commit 1e393b8c8139a39e8ff9bfc782e866598f3e2615 [revision 953] Author: Loren Merritt Date: Wed Aug 20 20:56:56 2008 -0600 remove the distinction between itex and ptex (changes 2pass statsfile format) commit 9ccd80faeec2b8baa565b5f2d577cf3f79efd2e7 [revision 952] Author: Loren Merritt Date: Wed Aug 20 20:51:39 2008 -0600 hardcode the ratecontrol equation, and remove the rceq option commit 79f4e3e270ebffc8663f35e9959abdd369b6f914 [revision 951] Author: Fiona Glaser Date: Wed Aug 27 13:14:36 2008 -0400 Fix some uses of uninitialized row_satd values in VBV Resolves some issues with QP51 in I-frames with scenecut commit 8de7dbbec1bc754826227c67cba74ad8a225cfde [revision 950] Author: Fiona Glaser Date: Tue Aug 26 14:51:29 2008 -0400 Activate trellis in p8x8 qpel RD Also clean up macroblock.c with some refactoring Note that this change significantly reduces subme7+trellis2 performance, but improves quality. Issue originally reported by Alex_W. commit 59de6938d16da6e79e572a41c2bbd9afc29e0a35 [revision 949] Author: Gabriel Bouvigne Date: Mon Aug 25 10:50:45 2008 -0400 Improve VBV accuracy Don't use the previous frame's row SATD as a predictor if it is too different from this frame's row SATD. commit 7421c8cf8587c6fa0ac8cd61ba1ffb9c20099c2d [revision 948] Author: Guillaume Poirier Date: Fri Aug 22 21:05:37 2008 +0200 improve generation of Darwin libraries Patch by vmrsss %vmrsss A gmail P com% commit 7086a2037ebb9bd45eec3cfa3372e0d04b7a2c31 [revision 947] Author: Fiona Glaser Date: Thu Aug 21 21:23:08 2008 -0400 Fix compilation in gcc 3.4.x (issue in r946) Due to a bug in gcc 3.4.x, in certain cases of inlining, the array_non_zero_int_mmx inline asssembly is miscompiled and causes a crash with --subme 7 --8x8dct. This minor hack fixes this issue. commit 20e8982e3196bf8d0820772571e75a50cd07aabe [revision 946] Author: Loic Le Loarer Date: Thu Aug 21 04:19:24 2008 -0600 shut up various gcc warnings commit 782740d5e4865f9bff83f8ac4f9b23fcf0f492f6 [revision 945] Author: Loren Merritt Date: Thu Aug 21 04:15:49 2008 -0600 fix a crash with invalid args and --thread-input (introduced in r921) commit 5a8727adddf4fc0f282c233c3a175f63ed41f211 [revision 944] Author: Loren Merritt Date: Wed Aug 20 05:36:32 2008 -0600 drop support for x86_32 PIC. commit 6dd4c075d8a1654ae928ea30ef7bdaf19a239cd9 [revision 943] Author: Loren Merritt Date: Tue Aug 19 01:55:57 2008 -0600 use permute macros in satd move some more shared macros to x264util.asm commit 08d39756a08e00ae39196b125e2cecdb08136e17 [revision 942] Author: Loren Merritt Date: Wed Aug 20 20:32:13 2008 -0600 cosmetics commit c47120f04f7c805955556d3466418d9eb347af52 [revision 941] Author: Loren Merritt Date: Wed Aug 20 19:00:52 2008 -0600 r940 broke threads commit 968609dc2c8c09f6a11f1a47755667d34b3736b0 [revision 940] Author: Fiona Glaser Date: Wed Aug 20 13:28:15 2008 -0400 Cleanups in macroblock_cache_save/load A bit more loop unrolling, and moving some constant code to the global init function commit 3b60ca85fb3bb8632a50378aac7fc21fce888cc5 [revision 939] Author: Fiona Glaser Date: Tue Aug 19 14:18:24 2008 -0600 Deblocking code cleanup and cosmetics Convert the style of the deblocking code to the standard x264 style Eliminate some trailing whitespace commit 8cbe60572ed19382f315af2c9f3ff267f91ccdd2 [revision 938] Author: Fiona Glaser Date: Mon Aug 18 23:03:37 2008 -0600 4% faster deblock: special-case macroblock edges Along with a bit of related code reorganization and macroification commit 45e367903b6af53319035d677f96d30514aa26ea [revision 937] Author: David Pethes Date: Sat Aug 16 09:43:26 2008 -0600 Add dedicated variance function instead of using SAD+SSD Faster variance calculation commit 2597644146f2469380b4c7073831b0a09116d79f [revision 936] Author: Loren Merritt Date: Fri Aug 15 03:04:28 2008 -0600 6% faster deblock: remove some clips, earlier termiantion on low qps. commit ddee314e91a679c9934d7482524a835b7c74fe1e [revision 935] Author: Fiona Glaser Date: Thu Aug 14 19:31:42 2008 -0600 Faster deblocking Early termination for bS=0, alpha=0, beta=0 Refactoring, various other optimizations About 30% faster deblocking overall. commit 144001ed6bafc658ff6212981c01c5480286af8e [revision 934] Author: Loren Merritt Date: Sat Aug 2 08:19:50 2008 -0600 asm cosmetics commit 95b2dd9926f2ca4722610f1d46d907a170539d51 [revision 933] Author: Daniel Vergien Date: Wed Aug 6 08:10:53 2008 -0600 yet another posix-emulating define on solaris commit 79c9a1d230f15f5afd041a469bda33ace9a86d50 [revision 932] Author: Gabriel Bouvigne Date: Wed Aug 6 07:45:05 2008 -0600 update msvc projectfile commit 56b3baeccec26d2498d1dd86f0852848a820b7a0 [revision 931] Author: Loren Merritt Date: Wed Aug 6 07:34:42 2008 -0600 drop support for msvc6 commit d17f473df108af93c7696c6a144717db9cc8c71c [revision 930] Author: Fiona Glaser Date: Sat Aug 9 09:36:04 2008 -0600 Prevent VBV from lowering quantizer too much This code seemed to act up unexpectedly sometimes, creating a situation where in 1-pass VBV mode, a frame's quantizer would drop all the way to qpmin and then shoot back upwards to qpmax, causing serious visual issues. This change may decrease bitrate in VBV mode, but that is preferable to the artifacting produced by this code. commit 1eb8b071a232873e40e001ec7379a917265bf372 [revision 929] Author: Fiona Glaser Date: Sat Aug 9 09:34:37 2008 -0600 Improve subme7 at low QPs and add subme7 support in lossless mode commit 01d7deaf4e1129fc5037740c18c4c199ca2ad275 [revision 928] Author: Loren Merritt Date: Wed Jul 30 22:35:20 2008 -0600 cosmetics: merge x86inc*.asm commit 3b6d783faba57af702040b47eb2585cf5b216356 [revision 927] Author: Fiona Glaser Date: Wed Jul 30 15:29:46 2008 -0600 Add missing x264util.asm commit 5914efe709e8be23a05253d96bf1eb2cfaa0c83c [revision 926] Author: Fiona Glaser Date: Wed Jul 30 15:28:21 2008 -0600 Basic sanity checking of qpmax/qpmin options commit ff7639b042d066be7e6a26ba23bdb9804457d644 [revision 925] Author: Fiona Glaser Date: Wed Jul 30 14:42:29 2008 -0600 Fix regression in r922 set the chroma DC coefficients to zero for residual coding in qpel-rd fix C99ism commit 543601b8ae441c60b27001bc03db4e7ff8db4fef [revision 924] Author: Holger Lubitz Date: Tue Jul 29 21:36:01 2008 -0600 Refactor asm macros part 2: DCT commit 60f7c47de10a240cb50568996ff8232726c19881 [revision 923] Author: Holger Lubitz Date: Tue Jul 29 21:26:58 2008 -0600 Refactor asm macros part 1: DCT commit 63b84fa435de4355abc5e80fdc78a5d3081addc6 [revision 922] Author: Fiona Glaser Date: Tue Jul 29 17:08:38 2008 -0600 Improve intra RD refine, speed up residual_write_cabac a do/while loop can be used for residual_write, but i8x8 had to be fixed so that it wouldn't call residual_write with zero coeffs proper nnz handling added to cabac intra rd refine chroma cbp added to 8x8 chroma rd cbp was tested, but wasn't useful commit 9da8410dfd8877576438c909a3311688c19d6104 [revision 921] Author: Fiona Glaser Date: Tue Jul 29 13:42:41 2008 -0600 Fix a few more minor memleaks commit 1b078852bc942a773baf371f460aa6e471076d44 [revision 920] Author: Loren Merritt Date: Fri Jul 25 18:14:31 2008 -0600 stats summary: print distribution of numbers of consecutive B-frames commit 6a85cf3434816ac7e7f8772f78f07f9b3934e2ee [revision 919] Author: Loic Le Loarer Date: Fri Jul 25 16:08:32 2008 -0600 add interlacing to the list of stuff checked by x264_validate_levels commit 5a9231a860ea07aab3c5405fda2371903a4b2b93 [revision 918] Author: Fiona Glaser Date: Thu Jul 24 07:58:50 2008 -0600 Fix C99-ism in r907 commit 502baa8a5f4271b99a35e79f0604f4bf6f541d22 [revision 917] Author: Fiona Glaser Date: Thu Jul 17 18:17:22 2008 -0600 Faster temporal predictor calculation Split into a separate commit because this changes rounding, and thus changes output slightly. commit a6cee0ab6d2e6a9fb6580827dc854c09567c74f0 [revision 916] Author: Fiona Glaser Date: Thu Jul 17 07:55:24 2008 -0600 Align lowres planes for improved cacheline split performance commit 579e930f34e88196fa96ca576b78891f3df69c87 [revision 915] Author: Loren Merritt Date: Tue Jul 15 20:16:16 2008 -0600 autodetect level based on resolution/bitrate/refs/etc, rather than defaulting to L5.1 if vbv is not enabled (and especially in crf/cqp), we have to guess max bitrate, so we might underestimate the required level. commit 95e859854267062a9f48143faf334cdff7f564e4 [revision 914] Author: Loren Merritt Date: Thu Jul 17 20:25:03 2008 -0600 fix bs_write_ue_big for values >= 0x10000. (no immediate effect, since nothing writes such values yet) commit 7070f098260c188b2f138b44def409108e8f2449 [revision 913] Author: BugMaster Date: Wed Jul 16 11:54:51 2008 -0600 Fix lossless mode borked in r901 commit 6916c39a51544070dec5b59fd03e571f74af06a1 [revision 912] Author: Fiona Glaser Date: Sat Jul 12 14:37:58 2008 -0600 Relax QPfile restrictions Allow a QPfile to contain fewer frames than the total number of frames in the video and have ratecontrol fill in the rest. Patch by kemuri9. commit 299820827918a7586e40dac9f9dd62221350506d [revision 911] Author: Fiona Glaser Date: Sat Jul 12 14:10:38 2008 -0600 Limit MVrange correctly in interlaced mode Bug report by Sigma Designs, Inc. commit 0e0904d2841960cd2f57d8ec1e873545dccf3522 [revision 910] Author: Fiona Glaser Date: Fri Jul 11 22:53:27 2008 -0600 Fix bug with PCM and adaptive quantization In rare cases CABAC desync could occur, causing bitstream corruption commit 1b7446bf3f35e3f680824371c00cd1a1d98eaf76 [revision 909] Author: BugMaster Date: Fri Jul 11 16:00:02 2008 -0600 Fix memory leak upon x264 closing Doesn't affect the CLI, but potentially important for programs which call x264 as a shared library. commit 59f016f9590de2cea68b82661599061a044840af [revision 908] Author: Fiona Glaser Date: Fri Jul 11 15:45:54 2008 -0600 Fix compilation on PPC systems (borked in r903) Bigendian systems didn't have endian_fix32 defined commit 13575fcdc791b05c26d48c04050f956620cc41e5 [revision 907] Author: Fiona Glaser Date: Fri Jul 11 14:16:18 2008 -0600 Add L1 reflist and B macroblock types to x264 info Also remove display of "PCM" if PCM mode is never used in the encode. L1 reflist information will only show if pyramid coding is used. commit 6b4ad5f53899a3eafff4307e98fae18998677568 [revision 906] Author: Fiona Glaser Date: Thu Jul 10 08:36:45 2008 -0600 Fix and enable I_PCM macroblock support In RD mode, always consider PCM as a macroblock mode possibility Fix bitstream writing for PCM blocks in CAVLC and CABAC, and a few other minor changes to make PCM work. PCM macroblocks improve compression at very low QPs (1-5) and in lossless mode. commit 05d7fb66d6aaaad00d833c810aee05b0c89948f9 [revision 905] Author: Loren Merritt Date: Fri Jul 4 21:03:26 2008 -0600 de-duplicate vlc tables commit 91e0ff6b490a01f5f5438639ed517cb4b09802f0 [revision 904] Author: Loren Merritt Date: Fri Jul 4 18:56:30 2008 -0600 faster ue/se/te write commit ab90da748df305101b720f932736dd6d7f990214 [revision 903] Author: Fiona Glaser Date: Fri Jul 4 18:32:32 2008 -0600 faster bs_write commit c61a1df1db0226cae8bd0b1b5be7e0856e0cb26c [revision 902] Author: Loren Merritt Date: Thu Jul 3 00:37:16 2008 -0600 cosmetics in ssd asm commit c9c7edf3e6fa8fbdd4d7bf2beccb448bdcac9aa4 [revision 901] Author: Fiona Glaser Date: Sun Jul 6 12:59:15 2008 -0600 Various optimizations and cosmetics Update AUTHORS file with Gabriel and me update XCHG macro to work correctly in if statements Add new lookup tables for block_idx and fdec/fenc addresses Slightly faster array_non_zero_count_mmx (patch by holger) Eliminate branch in analyse_intra Unroll loops in and clean up chroma encode Convert some for loops to do/while loops for speed improvement Do explicit write-combining on --me tesa mvsad_t struct Shrink --me esa zero[] array Speed up bime by reducing size of visited[][][] array commit 653249521805b21564c00148f7db1e4b28e6e15c [revision 900] Author: Fiona Glaser Date: Sun Jul 6 11:15:19 2008 -0600 Resolve floating point exception with frame_init_lowres mmx In some cases, the mmx version of frame_init_lowres could leave the FPU uninitialized for use in ratecontrol, resulting in floating point exceptions. Since frame_init_lowres is such a time-consuming function, an emms was just put at the end, since it costs almost nothing compared to the total time of frame_init_lowres. commit 552a04ea3c56317046686bdc41d31e15490f6b85 [revision 899] Author: Eric Petit Date: Fri Jul 4 11:31:32 2008 +0200 Update my email address commit bdbd4fe7709e129f90cf3d7d59b500e915c6b187 [revision 898] Author: Fiona Glaser Date: Thu Jul 3 20:05:00 2008 -0600 Update file headers throughout x264 Update "Authors" lists based on actual authorship; highest is most important Update copyright notices and remove old CVS tags from file headers Add file headers to GTK and other sections missing them Update FSF address Other header-related cosmetics commit fb660325d99298ab6cd2285d76f2fddf83fe34cb [revision 897] Author: Fiona Glaser Date: Wed Jul 2 20:59:24 2008 -0600 denoise_dct asm commit 223eedb04b9d97f2b20bde8136959e101bb3e0c9 [revision 896] Author: Loren Merritt Date: Wed Jul 2 20:55:10 2008 -0600 cosmetics in permutation macros SWAP can now take mmregs directly, rather than just their numbers commit 5b92682dcee03054ef6f033c9dde6ecd251674fa [revision 895] Author: Fiona Glaser Date: Wed Jul 2 10:43:57 2008 -0600 Fix bug in adaptive quantization In some cases adaptive quantization did not correctly calculate the variance. Bug reported by MasterNobody commit 04dc25367d218c92ba85c4cae34cc8b36bab05a3 [revision 894] Author: Loren Merritt Date: Sun Jun 29 00:00:03 2008 -0600 lowres_init asm rounding is changed for asm convenience. this makes the c version slower, but there's no way around that if all the implementations are to have the same results. commit a59f4a7b6bfc12bcd8763de6b008f1bb753b2dae [revision 893] Author: Fiona Glaser Date: Tue Jul 1 23:42:39 2008 -0600 Optimizations and cosmetics in macroblock.c If an i4x4 dct block has no coefficients, don't bother with dequant/zigzag/idct. Not useful for larger sizes because the odds of an empty block are much lower. Cosmetics in i16x16 to be more consistent with other similar functions. Add an SSD threshold for chroma in probe_skip to improve speed and minimize time spent on chroma skip analysis. Rename lambda arrays to lambda_tab for consistency. commit ed9a9f313240c887a7a3b330ceabe25fccbf47db [revision 892] Author: Gabriel Bouvigne Date: Thu Jun 26 21:09:55 2008 -0600 some asm functions require aligned stack. disable these when compiling with msvc/icc. commit e9369576747d339078b57fc227302f8c6e79011a [revision 891] Author: Fiona Glaser Date: Tue Jun 24 15:27:41 2008 -0600 Move bitstream end check to macroblock level Additionally, instead of silently truncating the frame upon reaching the end of the buffer, reallocate a larger buffer instead. commit ec3d09554addbcecb8cf82f3ff33ac737a6f996b [revision 890] Author: Fiona Glaser Date: Tue Jun 24 12:23:50 2008 -0600 Convert NNZ to raster order and other optimizations Converting NNZ to raster order simplifies a lot of the load/store code and allows more use of write-combining. More use of write-combining throughout load/save code in common/macroblock.c GCC has aliasing issues in the case of stores to 8-bit heap-allocated arrays; dereferencing the pointer once avoids this problem and significantly increases performance. More manual loop unrolling and such. Move all packXtoY functions to macroblock.h so any function can use them. Add pack8to32. Minor optimizations to encoder/macroblock.c commit d97bcbcbebcbe37d9e36b414a3eb371fdc0f4450 [revision 889] Author: Loren Merritt Date: Thu Jun 12 03:00:23 2008 -0600 mc_chroma_sse2/ssse3 commit 6ec1bd732c5eb73c9e303b8e7e3963c80044aa94 [revision 888] Author: Loren Merritt Date: Thu Jun 12 08:43:41 2008 -0600 checkasm --bench=function_name commit 473140b265c8865c4089cc8a78352dff4b4bc1f6 [revision 887] Author: Loren Merritt Date: Thu Jun 12 01:39:22 2008 -0600 interleave psnr/ssim computation with reference frame filtering, to improve cache coherency commit 2a7dd58c68fda378a5e8b68184ff56daee9f9019 [revision 886] Author: Fiona Glaser Date: Sun Jun 15 11:59:25 2008 -0600 Add more inline asm and a runtime check for MMXEXT support x264 will now terminate gracefully rather than SIGILL when run on a machine with no MMXEXT support. A configure option is now available to build x264 without assembly support for support on such old CPUs as the Pentium 2, K6, etc. commit 56108cb63848d4a553bccb7389226910f3f25e2e [revision 885] Author: Fiona Glaser Date: Sun Jun 15 11:51:36 2008 -0600 Use aligned memcpy for x264_me_t struct and cosmetics commit dba0e5a2e089cd675e201cdf4e3358eb7a0e22cc [revision 884] Author: Fiona Glaser Date: Sun Jun 15 11:50:17 2008 -0600 Cosmetics and loop unrolling GCC is not very good at loop unrolling in cases where it can perform constant propagation, so the unrolling unfortunately has to be done manually. commit d108f91937cdb67b2bfa4f6e7fc1cf6b776febbf [revision 883] Author: Fiona Glaser Date: Thu Jun 12 09:17:49 2008 -0600 Fix regression in 64-bit in r882 i_mvc needs to be 64-bit when used with a 64-bit memory pointer commit 5204112861581df847a4a892ea63b8a0d72f2e6c [revision 882] Author: Fiona Glaser Date: Thu Jun 12 08:09:22 2008 -0600 More tweaks to me.c Added inline MMX version of UMH's predictor difference test Various cosmetics throughout me.c Removed a C99-ism introduced in r878. commit d4e077867f79a555efb83e45d93dc6f170b1fb3e [revision 881] Author: Fiona Glaser Date: Wed Jun 11 18:23:00 2008 -0600 Fix regression in r736 r736 added intra RD refinement to B-frames; however, it is possible for subme=7 to be used without b-rdo. This means intra RD isn't run, and therefore it is possible for intra chroma analysis to not have been run, since update_cache was never called for an intra block, and chroma ME is not required even at subme=7. r801, which removed a memset, made this worse because previously the chroma prediction mode was at least initialized to zero; now it was not initialized at all. Therefore, --no-chroma-me, --subme 7, and no --b-rdo had the potential to crash. This change restricts intra RD refinement to only be run when --b-rdo is enabled (sensible to begin with), thus preventing a crash in this case. commit 3a095b2ce5c30eea665f0e6fb44ba2b3510adf65 [revision 880] Author: Fiona Glaser Date: Tue Jun 10 21:37:57 2008 -0600 Fix regression in r850 Bug resulted in rare incorrect chroma encoding commit 5e59162c0ab7cafec85ba7c8cf648d7300cdc860 [revision 879] Author: Gabriel Bouvigne Date: Tue Jun 10 18:40:52 2008 -0600 Cosmetics in VBV handling commit d4a4b3f168251a9474f2a945e859e7813a7a3120 [revision 878] Author: Fiona Glaser Date: Tue Jun 10 18:34:46 2008 -0600 Tweaks and cosmetics in me.c Use write-combining for predictor checking and other tweaks. commit 9cc180ac4a79cae85790c1eeefa692d4f12b5232 [revision 877] Author: Fiona Glaser Date: Fri Jun 6 14:59:10 2008 -0600 Partially inline trellis quantization Inlining trellis into the 4x4/8x8 trellis wrappers increases trellis speed by about 5-10% through constant propagation. commit 3d9b6b3ce55dce861d8b64f832fa40dfe67d6bca [revision 876] Author: Fiona Glaser Date: Fri Jun 6 12:32:57 2008 -0600 Various cosmetic changes. commit 49ce3ac63b5305ca28f65bd75e6a4e6540d5954a [revision 875] Author: Fiona Glaser Date: Fri Jun 6 22:57:33 2008 -0600 avg_weight_sse2 commit c0c0e1f48de74acec0b681bfa842d3c8cddb4a32 [revision 874] Author: Loren Merritt Date: Fri Jun 6 23:31:22 2008 -0600 many changes to which asm functions are enabled on which cpus. with Phenom, 3dnow is no longer equivalent to "sse2 is slow", so make a new flag for that. some sse2 functions are useful only on Core2 and Phenom, so make a "sse2 is fast" flag for that. some ssse3 instructions didn't become useful until Penryn, so yet another flag. disable sse2 completely on Pentium M and Core1, because it's uniformly slower than mmx. enable some sse2 functions on Athlon64 that always were faster and we just didn't notice. remove mc_luma_sse3, because the only cpu that has lddqu (namely Pentium 4D) doesn't have "sse2 is fast". don't print mmx1, sse1, nor 3dnow in the detected cpuflags, since we don't really have any such functions. likewise don't print sse3 unless it's used (Pentium 4D). commit f9ad5ee2564bb272635f0c69fefa28e0b1b47f37 [revision 873] Author: Loren Merritt Date: Fri Jun 6 23:30:37 2008 -0600 enable ssse3 phadd satd on Penryn. commit b8670681bbe2312f1b2d1842bbd473223f005c69 [revision 872] Author: Loren Merritt Date: Fri Jun 6 22:59:37 2008 -0600 benchmark most of the asm functions (checkasm --bench). commit c24df7dae689d86e1d55137d343fc3589a75887d [revision 871] Author: Fiona Glaser Date: Thu Jun 5 11:32:05 2008 -0600 Cosmetic: fix C99-ism commit a6c98f6f5798e31634b47aee0a18d7ecd5eff3e1 [revision 870] Author: Fiona Glaser Date: Wed Jun 4 21:28:48 2008 -0600 Use a gaussian window for cplxblur Cplxblur was originally intended to use a gaussian window, but in its current form did not. This change provides a tiny improvement to 2pass ratecontrol. commit 970d61004fe559aa3e89e64185ee3b9efea53954 [revision 869] Author: Loren Merritt Date: Mon Jun 2 09:47:50 2008 -0600 cosmetics commit b5053542a8e89beff4fb4ac13c8030f35c2fd79d [revision 868] Author: Loren Merritt Date: Mon Jun 2 09:40:49 2008 -0600 nasm compatible NX stack commit 8c1ec12a747e8b9e5c4eebf786b8262286b9c965 [revision 867] Author: Loren Merritt Date: Mon Jun 2 08:57:59 2008 -0600 CQP is incompatible with AQ commit 9bdf19c2f114a439cc0f4d27ab8493912918584d [revision 866] Author: Fiona Glaser Date: Sat May 24 13:10:21 2008 -0600 memzero_aligned_mmx commit 579857968ab579b378f96d96c02aba68a2450367 [revision 865] Author: BugMaster Date: Sat May 24 01:09:07 2008 -0600 binmode stdin on mingw, not just msvc commit 71a919d44670fb4e1c4777d770c112a2f23f9b23 [revision 864] Author: Fiona Glaser Date: Fri May 23 21:22:29 2008 -0600 omit redundant mc after non-rdo dct size decision, and in b-direct rdo commit 53712c4bb01f9cc5c22d7eeb408f20d4f4d9520e [revision 863] Author: Fiona Glaser Date: Wed Apr 9 16:46:51 2008 -0600 allow fractional CRF values with AQ. commit 8a1d6cb266b8fa4f29725bca31c265253134fcc9 [revision 862] Author: Noboru Asai Date: Mon Jun 2 09:12:29 2008 -0600 fix some uninitialized partitions in rdo commit 56f2bc8950f5abaf20b1241511d1b02db3945f3d [revision 861] Author: Gabriel Bouvigne Date: Mon Jun 2 12:53:01 2008 -0600 2-pass VBV support and improved VBV handling Dramatically improves 1-pass VBV ratecontrol (especially CBR) and provides support for VBV in 2-pass mode. This consists of a series of functions that attempts to find overflows and underflows in the VBV from the first-pass statsfile and fix them before encoding. 1-pass VBV code partially by Fiona Glaser. commit 344cb1693dbe1471a3a94fef3156e94d684350de [revision 860] Author: Alexander Strange Date: Mon Jun 2 12:16:51 2008 -0600 Fix noise reduction in threaded mode. Previously enabling noise reduction with threads had no effect. Note that this is not an optimal solution; each thread still tracks noise reducation separately (unlike in single-threaded mode). commit 708b9862103947e687424dce8cbd9fade3e094b6 [revision 859] Author: Loren Merritt Date: Tue May 20 20:15:41 2008 -0600 fix a crash on win32 with threads. r852 introduced an assumption in deblock that the stack is aligned. commit 1851df553d6a5983d9db7a83e6cf922e7be0b5bb [revision 858] Author: Loren Merritt Date: Tue May 20 03:58:08 2008 -0600 remove nasm version check. a feature check is all that's needed. silence stderr in yasm version check. commit d4e6d802a4fd36dd1b4c0d15907660a20233ff47 [revision 857] Author: Loren Merritt Date: Sun May 18 08:33:34 2008 -0600 cosmetics in cabac commit 764a012365e23d35f54f103fb174a5b4319d5fed [revision 856] Author: Fiona Glaser Date: Sun May 18 07:14:28 2008 -0600 faster residual_write_cabac commit 92b3ea8c24f5932a3535cac71e8d9260e5a8e198 [revision 855] Author: Loren Merritt Date: Sun May 18 06:23:57 2008 -0600 change DEBUG_DUMP_FRAME to run-time --dump-yuv commit cb4dc4aee33f679ba4f73010c9b88f5d54799740 [revision 854] Author: Loren Merritt Date: Sat May 17 03:39:59 2008 -0600 x264_median_mv_mmxext this is the first non-runtime-detected use of mmxext, but it has to be inlined commit b594e8f9e4ff44d797820b3020e9d3a179843e50 [revision 853] Author: Loren Merritt Date: Sun Apr 20 03:18:19 2008 -0600 factor duplicated code out of deblock chroma mmx commit ffd9196b0f62edd09b3a581c8acc1072c1ddfaf0 [revision 852] Author: Loren Merritt Date: Tue Apr 15 17:52:32 2008 -0600 deblock_luma_intra_mmx commit 20f7ae51a4ac13b6e38a0edb1717b56733ae68c7 [revision 851] Author: vmrsss Date: Sat May 17 00:50:22 2008 -0600 write aspect ratio in mp4 commit d5d07b1823292e65572466577689819cf3bb98ec [revision 850] Author: Fiona Glaser Date: Thu May 15 22:44:12 2008 -0600 omit delta_quant in i16x16 blocks with no residual (all other block types were already covered, but i16x16 cbp is special) commit bfa2eac7fdc92eaf27004ef66e93898ec27f61f1 [revision 849] Author: Fiona Glaser Date: Thu May 15 06:01:01 2008 -0600 explicit write combining, because gcc fails at optimizing consecutive memory accesses commit 32bd2d645c63c7cf55a2f9b33e39e63144c3e835 [revision 848] Author: Fiona Glaser Date: Thu May 15 05:41:43 2008 -0600 force unroll macroblock_load_pic_pointers and a few other minor optimizations commit 2d816a51e2ed594f3e98515ea5f427d08a4df638 [revision 847] Author: Fiona Glaser Date: Thu May 15 05:14:53 2008 -0600 quant_2x2_dc_ssse3 commit 0bb9b6b8bb2853ccf18553f29e220e583f712f42 [revision 846] Author: Fiona Glaser Date: Sat May 17 00:47:31 2008 -0600 r836 borked lossless cabac nnz commit 08ad421f4b79eafafa100e95a28d861f07dfaed4 [revision 845] Author: Henry Bent Date: Wed May 7 19:49:14 2008 -0600 use elf instead of a.out on netbsd commit a0194ef6806ac3249c9add16f82dc4a38cf2680e [revision 844] Author: Ning Xin Date: Wed May 7 17:18:44 2008 -0600 fix x264_realloc when not using libc realloc. commit 1baca94c8da547ebc3363678da6fe09556b97658 [revision 843] Author: Loren Merritt Date: Mon May 5 16:28:24 2008 -0600 don't pretend to support win64. remove all related code. it hasn't worked since probably some time in 2005, and won't ever be fixed unless someone steps up to maintain it. commit d09b8e9155b67dd5554b211035e34ac54f8b24c1 [revision 842] Author: Loren Merritt Date: Mon May 5 16:25:19 2008 -0600 cosmetics: replace last instances of parm# asm macros with r# commit 709093dfde92120131f69dd3780a66e01e5d3d67 [revision 841] Author: Loren Merritt Date: Mon Apr 28 03:12:29 2008 -0600 remove DEBUG_BENCHMARK commit 108897fde1a1713ff2546a1dc4e998e1b8b95f44 [revision 840] Author: Fiona Glaser Date: Sun Apr 27 03:10:28 2008 -0600 faster probe_skip commit ad6c91f064e6e6ceab3b876713006e5e1fb3f574 [revision 839] Author: Loren Merritt Date: Tue Apr 22 17:16:25 2008 -0600 drop support for pre-SSE3 assemblers commit 27ae7576cf0a978317fc9c1be3fc3b562338a7c4 [revision 838] Author: Loren Merritt Date: Fri Apr 25 00:33:12 2008 -0600 s/x264_cpu_restore/x264_emms/ no point in giving it a generic name when it's not generic commit 495463e3f7ddbd643a4bfb8475bf3dfbe4fb4bf9 [revision 837] Author: Fiona Glaser Date: Sun Apr 27 02:37:37 2008 -0600 faster cabac_mb_cbp_luma ported from ffmpeg commit 36f80085d73652cbddfeb9de92ec6e41e6b6d34f [revision 836] Author: Fiona Glaser Date: Fri Apr 25 21:41:40 2008 -0600 remove some redundant nnz counts move some nnz counts from macroblock_encode to cavlc if cabac doesn't need them commit 79f03a3ba1fa908b9044845d8b52b376997c74e9 [revision 835] Author: Fiona Glaser Date: Fri Apr 25 20:43:57 2008 -0600 compute missing nnz count in subme7 cavlc commit 20720d6b7e2e0406e29a121d5340cd5199083f44 [revision 834] Author: Fiona Glaser Date: Fri Apr 25 01:47:47 2008 -0600 remove a division in macroblock-level bookkeeping commit 03da01e43fbccb14e054bab2464e594991e5108f [revision 833] Author: Fiona Glaser Date: Thu Apr 24 18:55:30 2008 -0600 omit P/B-skip mc from macroblock_encode if the pixels haven't been overwritten since probe_skip commit e0f13712fd496702f3f7c0cecfb043f0a6af9b3e [revision 832] Author: Loren Merritt Date: Thu Apr 24 05:17:04 2008 -0600 earlier termination in SEA if mvcost exceeds residual commit 2fe89852d1789865e8bea8fee18438d14ddeaf4e [revision 831] Author: Loren Merritt Date: Tue Apr 22 04:00:24 2008 -0600 remove void* arithmetic from r821 commit 8b6df37d8f1882bba3702aa3dafe7ae38bbd6b23 [revision 830] Author: Guillaume Poirier Date: Fri Apr 25 11:29:09 2008 +0200 Fix define of illegal function identifiers (as defined in section "7.1.3 Reserved identiers" of C99 spec) commit e0d72e3d4963e671626ed31e57221ebe45283d75 [revision 829] Author: Guillaume Poirier Date: Fri Apr 25 10:50:48 2008 +0200 Fix define of illegal identifier (as defined in section "7.1.3 Reserved identiers" of C99 spec) "__UNUSED__", and use the one defined in common/osdep.h, i.e. "UNUSED" based on a patch by Diego Biurrun commit a146ee3d8bb003e66ccf908ca83239c693068f48 [revision 828] Author: Guillaume Poirier Date: Thu Apr 24 14:46:11 2008 +0200 more consistent include name (in line with other PPC includes) commit 4ee9642aef80b64f7b726ad245e6ab2ea631e896 [revision 827] Author: Guillaume Poirier Date: Thu Apr 24 14:44:24 2008 +0200 fix illegal identifiers in multiple inclusion guards patch by Diego Biurrun % diego A biurrun P de % commit 34ed67475e0f08fc502ed287f918cf7d676f20bf [revision 826] Author: Fiona Glaser Date: Tue Apr 22 00:38:37 2008 -0600 AQ now treats perfectly flat blocks as low energy, rather than retaining previous block's QP. fixes occasional blocking in fades. commit 28a2d7af87ceb29b93e73c99406316c04b2c9f23 [revision 825] Author: Loren Merritt Date: Sun Apr 20 12:19:46 2008 -0600 checkasm cabac commit 1877d1c430ff6a167b76736cc3527efe19dc330c [revision 824] Author: Loren Merritt Date: Sun Apr 20 02:39:31 2008 -0600 s/movdqa/movaps/g commit 6df41d50d936c428e1e5239a2eded54ee13e9156 [revision 823] Author: Loren Merritt Date: Sun Apr 20 18:25:53 2008 -0600 --asm to allow testing of different versions of asm without recompile commit 87132ed66fd3887a0a16618272fcf97ed244f6af [revision 822] Author: Loren Merritt Date: Sat Apr 12 01:40:28 2008 -0600 copy left neighbor pixels directly from previous mb instead of main plane commit 6eb5483505f40bb319ce0afa052ee41543993fc1 [revision 821] Author: Fiona Glaser Date: Wed Apr 9 16:30:34 2008 -0600 cacheline split workaround for mc_luma commit c1e43f094095265a77c9584fbfc25209b62efc78 [revision 820] Author: Guillaume Poirier Date: Wed Apr 16 10:46:15 2008 +0200 add "SECTION_RODATA" before "SECTION .text" to setup the fakegot label used in macho binaries. This fixes compilation with --enable-pic Requires Yasm 0.7.0 or newer Patch by Dave Lee % davelee P com A gmail P com % commit 67813bbfbbb43aa65a15e659a5ea668c8d8cb26c [revision 819] Author: Loren Merritt Date: Sun Apr 13 10:29:15 2008 -0600 more hpel fixes commit 0acaad1b446fbe76e7dc6924f3005d3bf88f73ce [revision 818] Author: Gabriel Bouvigne Date: Thu Apr 10 08:59:19 2008 -0600 update msvc projectfile commit 25558461eae393552684aab8d847a065c73667e5 [revision 817] Author: Loren Merritt Date: Fri Apr 11 18:48:30 2008 -0600 r810 borked hpel_filter_sse2 on unaligned buffers commit 0f453853989c33919769b1a8376d7daf6acfccd2 [revision 816] Author: Loren Merritt Date: Thu Apr 10 03:17:53 2008 -0600 threads=auto on multicore now implies thread input, just like explicit thread numbers already did commit 539103a59525903e82a013d7a300d2a786d35d22 [revision 815] Author: Loren Merritt Date: Tue Apr 8 20:16:50 2008 -0600 dct4 sse2 commit 9168abfaf453622d8297ee049dc5951f93ad196c [revision 814] Author: Loren Merritt Date: Tue Apr 8 12:19:23 2008 -0600 faster x86_32 dct8 commit 56bf7565f2743e4fe85763388cb74b75b9bf41c5 [revision 813] Author: Loren Merritt Date: Mon Apr 7 10:22:03 2008 -0600 macros to deal with macros that permute their arguments commit 32ef8652729945bc7bcaf2b1d3e112f5a7530bc6 [revision 812] Author: Loren Merritt Date: Mon Apr 7 08:24:40 2008 -0600 mmx cachesplit sad of non-square sizes checked height instead of width commit afbcfdc2d2b74f318eb1cb1db7a6a711d4d115e5 [revision 811] Author: Loren Merritt Date: Fri Apr 4 01:07:40 2008 -0600 sfence after nontemporal stores commit 7bdaab607070f6c30eb919d9e22c073650ee2f70 [revision 810] Author: Loren Merritt Date: Wed Apr 2 11:22:43 2008 -0600 simplify hpel filter asm (move control flow to C) and add sse2, ssse3 versions commit 29899d84c3ca0e11f70a0aea8e6adf721e6bbfb2 [revision 809] Author: Loren Merritt Date: Thu Apr 3 20:46:36 2008 -0600 more mmx/xmm macros (mova, movu, movh) commit 937b792529e12884ec9a6e094f93628640f60ad2 [revision 808] Author: Fiona Glaser Date: Wed Apr 2 05:06:02 2008 -0600 improve handling of cavlc dct coef overflows support large coefs in high profile, and clip to allowed range in baseline/main commit bdfec13d6ac6e70f9a01b1a36b1398b800c58e0f [revision 807] Author: Guillaume Poirier Date: Mon Mar 31 10:50:45 2008 +0200 fix shared libs on MacOSX based on a patch by İsmail Dönmez commit 658b058609f6ae51b18d4b022d0398d4d11ff134 [revision 806] Author: Loren Merritt Date: Mon Mar 31 02:27:53 2008 -0600 typo in r803 commit def7e3aaf85502eba82efa9a13c176e87cff8268 [revision 805] Author: Loren Merritt Date: Sun Mar 30 18:08:28 2008 -0600 fix a crash on mp4 muxing with invalid params commit b59440f09b7eb7e6f30c1131d56843ee92e3751d [revision 804] Author: Fiona Glaser Date: Sun Mar 30 17:58:41 2008 -0600 variance-based psy adaptive quantization new options: --aq-mode --aq-strength AQ is enabled by default commit 8d8f3ea41ddb1c11baf018c9db58df3747f8697f [revision 803] Author: Zuxy Meng Date: Sat Mar 29 18:04:23 2008 -0600 fix naming of .dll on mingw commit 7e3ef7ce24f223ea1734a64585f747131d1d7b6b [revision 802] Author: Loren Merritt Date: Sat Mar 29 17:53:36 2008 -0600 don't distinguish between mingw and cygwin commit 05e91fb1c26bb42d1c124e7812fc2be0533eae6e [revision 801] Author: Fiona Glaser Date: Sat Mar 29 16:27:54 2008 -0600 remove a memset commit 6f44349a9a703cd673283810fe56febec9f76783 [revision 800] Author: Loren Merritt Date: Sat Mar 29 16:27:08 2008 -0600 typo. don't evaluate rd pskip when p16x16 found ref>0. commit 27b73b3b86524ec9b0bdf8310a55081898b408c0 [revision 799] Author: Loren Merritt Date: Sat Mar 29 20:42:51 2008 -0600 r784 borked lossless dc zigzag commit c1c00e6cc02f500de1b955897e60b0f16ebb0ddf [revision 798] Author: Loren Merritt Date: Tue Mar 25 07:31:51 2008 -0600 fix an arithmetic overflow that disabled SEA threshold after finding a mv with SAD < mvcost. commit 1c72e71929a6eabded391ec6f16ba66d3cac75f3 [revision 797] Author: Guillaume Poirier Date: Tue Mar 25 16:30:04 2008 +0000 fix hpel_filter_altivec picked up by checkasm Patch by Manuel %maaanuuu A gmx.net % and Noboru Asai % noboru P asai A gmail P com % commit 66a0c19d3659bbbc69decb88465f5957cf3611ef [revision 796] Author: Loren Merritt Date: Tue Mar 25 00:59:50 2008 -0600 faster residual commit 41cd480c2b0b83a939effe01e855b9099d1124eb [revision 795] Author: Loren Merritt Date: Mon Mar 24 21:31:46 2008 -0600 nasm doesn't like align(nop) in structs commit 727377bf7582bf4c69083f2bf94b4e6c5965032c [revision 794] Author: Loren Merritt Date: Mon Mar 24 19:25:19 2008 -0600 reduce the size of some cabac arrays commit c9e8cfed1fec1a3e69db676725178c27b80c9c90 [revision 793] Author: Fiona Glaser Date: Mon Mar 24 19:21:24 2008 -0600 use cabac context transition table from trellis in normal residual coding too commit a3e11cbf36d78cb6b4147e8f1a73ee7c9387397e [revision 792] Author: Fiona Glaser Date: Mon Mar 24 19:12:07 2008 -0600 rearrange cabac struct to reduce code size commit 9289e80611d89ad4050fa738dec9a530c8f4e3d4 [revision 791] Author: Fiona Glaser Date: Mon Mar 24 03:25:25 2008 -0600 higher precision RD lambda improves quality at QP<=12. commit aaced0861e76767a5c0ce24a94214a261d9eb459 [revision 790] Author: Loren Merritt Date: Mon Mar 24 01:56:31 2008 -0600 faster cabac_encode_ue_bypass commit 23e52ef3bbc0690fe55e49bf32c595adc0404878 [revision 789] Author: Loren Merritt Date: Sun Mar 23 22:14:18 2008 -0600 cabac asm. mostly because gcc refuses to use cmov. 28% faster than c on core2, 11% on k8, 6% on p4. commit ecc95abde1efbd7d3ea9a42475a628f31cb572ea [revision 788] Author: Loren Merritt Date: Sun Mar 23 22:08:07 2008 -0600 cosmetics in cabac commit 9e7cfc35e955c3693b5690c233cc0049be222bce [revision 787] Author: Loren Merritt Date: Sat Mar 22 20:25:06 2008 -0600 inline cabac_size_decision commit 542027fac9212ca1f6d24a39ebea779bfec91123 [revision 786] Author: Loren Merritt Date: Sat Mar 22 03:25:03 2008 -0600 cosmetics in DECLARE_ALIGNED commit 52fb83347c17f88ea523763223b555ff5f475698 [revision 785] Author: Loren Merritt Date: Sat Mar 22 03:06:18 2008 -0600 don't distinguish between luma4x4 and luma4x4ac commit b437d2d4c90056b1dcb4f3220234d06d03f3e9b4 [revision 784] Author: Loren Merritt Date: Sat Mar 22 02:46:31 2008 -0600 faster lossless zigzag commit 489555ed890812e16b0f6a14b86abe0e819ab513 [revision 783] Author: Loren Merritt Date: Sat Mar 22 03:14:33 2008 -0600 more alignment commit 7b0e2bde0aedbd174942bf8c2dc7378ef1f8418a [revision 782] Author: Loren Merritt Date: Sat Mar 22 01:49:52 2008 -0600 add tesa and lossless to fprofile commit 41a1a09f4e6f0bf5d30dbaa745adf554d5597b56 [revision 781] Author: Loren Merritt Date: Sat Mar 22 01:46:43 2008 -0600 cosmetics in residual_write commit 91b126573702ac2689462d857958a9a7579956b0 [revision 780] Author: Loren Merritt Date: Fri Mar 21 23:24:33 2008 -0600 remove unused bitstream reader commit 7822bab33a0471f01b8ff7d14eb7d843602953bd [revision 779] Author: Loren Merritt Date: Fri Mar 21 18:58:46 2008 -0600 cosmetics in quant asm commit 5d972bf3f220404f7e7bd7595f8c3a804191b35c [revision 778] Author: Loren Merritt Date: Fri Mar 21 18:46:29 2008 -0600 special case dequant for flat matrix commit f63770aa6540b9ced04043ae474b1d3b99f024f9 [revision 777] Author: Loren Merritt Date: Fri Mar 21 00:04:46 2008 -0600 faster dequant commit 263abc67c9c37d8cfddcb5bc9e9ddf768366429b [revision 776] Author: Loren Merritt Date: Thu Mar 20 22:08:07 2008 -0600 simplify hpel_filter_c commit ead697cad4c2090255fccecbded84346fd398075 [revision 775] Author: Loren Merritt Date: Thu Mar 20 19:35:54 2008 -0600 use x264_mc_copy_w16_sse2 in mc.copy, it was previously only in mc_luma commit 14b45a81c25be808e3da6d7b3e78051f6c5b5308 [revision 774] Author: Loren Merritt Date: Thu Mar 20 14:00:08 2008 -0600 new ssd_8x*_sse2 align ssd_16x*_sse2 unroll ssd_4x*_mmx commit 72869f7648732a0d39398068e6288730ad009135 [revision 773] Author: Manuel Rommel Date: Thu Mar 20 13:21:16 2008 -0600 update altivec zigzags commit dbfd2cc73b01dc3256a4b7bf65b8b0632fd73f96 [revision 772] Author: Loren Merritt Date: Thu Mar 20 10:41:50 2008 -0600 r768 borked cavlc commit ac49761ae664e86598609301d7ca25063a003151 [revision 771] Author: Loren Merritt Date: Thu Mar 20 00:52:11 2008 -0600 cosmetics in intra predict commit 5442dafdab1f18eb5d6f27813dbd0d8b1f37a300 [revision 770] Author: Fiona Glaser Date: Thu Mar 20 00:31:42 2008 -0600 faster intra predict 8x8 hu/hd commit 30da25a99e24e5c1ff5972b7f5c22c4be2a944b1 [revision 769] Author: Loren Merritt Date: Wed Mar 19 23:43:19 2008 -0600 reduce zigzag arrays from int to int16_t commit 7a125e4a89b6c1cfd5066706939b7dee5a755254 [revision 768] Author: Loren Merritt Date: Wed Mar 19 23:42:20 2008 -0600 reduce the size of some arrays commit 1d56ef44748dd3ae36751f27263ccefc22d5f543 [revision 767] Author: Fiona Glaser Date: Wed Mar 19 15:01:05 2008 -0600 skip intra pred+dct+quant in cases where it's redundant (analyse vs encode) large speedup with trellis=2, small speedup with trellis=0 and/or subme>=6 commit b10ee560293bfff1ac2e72d8a6a61fae4812a9a6 [revision 766] Author: Loren Merritt Date: Wed Mar 19 14:03:34 2008 -0600 cosmetics in asm commit a3b97adbcc84448c9c24520ede8188d6c99bf5cb [revision 765] Author: Fiona Glaser Date: Wed Mar 19 14:00:34 2008 -0600 satd_4x4_ssse3 commit 8773988471e5469ebd00841cccb4eee8bbdb54dd [revision 764] Author: Fiona Glaser Date: Wed Mar 19 13:40:41 2008 -0600 get_ref_sse2 commit 8727a01bf21a52224c5de130e1173de31062ab87 [revision 763] Author: Fiona Glaser Date: Tue Mar 18 19:17:22 2008 -0600 continue instead of crash when the threading mv constraint is violated. doesn't fix the underlying bug, but hopefully less annoying until we find it. commit b2d5df5fdf80e42fe03526f9d73983477f2013af [revision 762] Author: Loren Merritt Date: Tue Mar 18 18:24:01 2008 -0600 remove remaining reference to clip1.h commit 73b3fd48e592de96c05bdfe0cf7144c0da6ac650 [revision 761] Author: Loren Merritt Date: Tue Mar 18 12:34:10 2008 -0600 fix name mangling again. apparently it's not just a convention, dll build fails if you try to export a non-prefixed name. commit 1e829dbf23af6bda0103c15de11115461e9bc504 [revision 760] Author: Gabriel Bouvigne Date: Mon Mar 17 15:44:40 2008 -0600 update msvc projectfile commit 08c7dd46963096ba4d3e9187fdf14ec1338fc959 [revision 759] Author: Loren Merritt Date: Mon Mar 17 15:41:59 2008 -0600 missing #ifdef HAVE_SSE3 commit 0f51933971f6bb695ef88c03e2c3e76c61d1c95f [revision 758] Author: Loren Merritt Date: Mon Mar 17 15:41:30 2008 -0600 don't define offsetof since it's standard commit cfa08dc7ba39460a777b5161aa8024d85197b0d3 [revision 757] Author: Loren Merritt Date: Mon Mar 17 01:23:35 2008 -0600 shut up gcc warning in offsetof commit e56ea0861b650e1ee5f3951d786bfc5297183574 [revision 756] Author: Håkan Hjort Date: Mon Mar 17 01:20:02 2008 -0600 increase alignment of mv arrays commit 5469a4baaf379ab30119c625067be6dd23cb3bfe [revision 755] Author: Fiona Glaser Date: Sun Mar 16 23:58:04 2008 -0600 memcpy_aligned_sse2 commit 9d0c0a90254e39adb581158d37a6946064fce4e2 [revision 754] Author: Loren Merritt Date: Sun Mar 16 22:40:43 2008 -0600 checkasm check whether callee-saved regs are correctly saved x86_32 only for now since x86_64 varargs are annoying commit 1b1e12482da963334290bd157088427f37f4e2d3 [revision 753] Author: Loren Merritt Date: Sun Mar 16 22:28:20 2008 -0600 fix x86_32 ads which failed to preserve a register commit c82674fe244b8cd7828117f267a8fca2b62f7cfd [revision 752] Author: Loren Merritt Date: Sun Mar 16 16:34:41 2008 -0600 fix some name mangling issues introduced by the merge commit 20b4106bb404de6be488a979ffafad35b7b8f691 [revision 751] Author: Loren Merritt Date: Sun Mar 16 15:30:40 2008 -0600 remove x264_mc_clip1. it's wrong for sufficiently perverse inputs, and clip_uint8 is faster anyway. commit c17218e8a37ca1ed93a0852b73acc5d4cc046bb8 [revision 750] Author: Loren Merritt Date: Sun Mar 16 13:54:58 2008 -0600 merge x86_32 and x86_64 asm, with macros to abstract calling convention and register names commit 3445cca40e490cd11075051215c5d7c49477c7f7 [revision 749] Author: Loren Merritt Date: Sun Mar 9 05:58:55 2008 -0600 git compatible version script commit 8609ffa0dd7092509c0ec5c4c667ab6eea503fd7 [revision 748] Author: Loren Merritt Date: Sun Mar 2 17:53:01 2008 -0700 check for broken versions of yasm commit 3d5beaee8325c2788f17a632558759ec95ec76e6 [revision 747] Author: Loren Merritt Date: Sun Mar 2 17:27:38 2008 -0700 increase the alignment of the i8x8 edge cache, needed for sse2 intra prediction. patch by Alexander Strange. commit 25fd257d988cf7f4e00fa674e1e99417e4f7ef6e [revision 746] Author: Loren Merritt Date: Sun Mar 2 16:12:57 2008 -0700 .gitignore commit 9dce08ac53aa22695d1934bc321122863ac3739e [revision 745] Author: Loren Merritt Date: Sun Mar 2 03:04:07 2008 +0000 pic macros now keep track of which register holds the GOT, so variable access doesn't have to care git-svn-id: svn://svn.videolan.org/x264/trunk@745 df754926-b1dd-0310-bc7b-ec298dee348c commit 264f13aeaf52c7c8c38a35ab781561c4692e251e [revision 744] Author: Loren Merritt Date: Sun Mar 2 02:27:45 2008 +0000 remove x86_64 predict_8x8_ddl_mmxext because sse2 is faster even on amd git-svn-id: svn://svn.videolan.org/x264/trunk@744 df754926-b1dd-0310-bc7b-ec298dee348c commit 315285741877f89c660b9cefc3114963e95cf56a [revision 743] Author: Loren Merritt Date: Sun Mar 2 02:26:00 2008 +0000 cosmetics in dsp init git-svn-id: svn://svn.videolan.org/x264/trunk@743 df754926-b1dd-0310-bc7b-ec298dee348c commit 564cc252b099ad12e9c33dd9404ad64ed0bc5b8f [revision 742] Author: Loren Merritt Date: Sun Mar 2 02:11:12 2008 +0000 sse2 16x16 intra pred. port the remaining intra pred functions from x86_64 to x86_32. patch by Fiona Glaser. git-svn-id: svn://svn.videolan.org/x264/trunk@742 df754926-b1dd-0310-bc7b-ec298dee348c commit c48882dd3f9a51b64c2d129f604bedb79d140626 [revision 741] Author: Loren Merritt Date: Sat Mar 1 13:47:05 2008 +0000 some simplifications to mmx intra pred that should have been done way back when we switched to constant fdec_stride. and remove pic spills in functions that have a free caller-saved reg. patch partly by Fiona Glaser. git-svn-id: svn://svn.videolan.org/x264/trunk@741 df754926-b1dd-0310-bc7b-ec298dee348c commit 78ec787957d66996336224fde5e0bec38bf11b3c [revision 740] Author: Loren Merritt Date: Sat Mar 1 07:30:34 2008 +0000 faster array_non_zero git-svn-id: svn://svn.videolan.org/x264/trunk@740 df754926-b1dd-0310-bc7b-ec298dee348c commit 405d9c668463f784301b705633d7723638e068e3 [revision 739] Author: Loren Merritt Date: Sat Mar 1 04:33:24 2008 +0000 x86_32 sse2 idct8 ported from ffmpeg by Fiona Glaser git-svn-id: svn://svn.videolan.org/x264/trunk@739 df754926-b1dd-0310-bc7b-ec298dee348c commit b6d6d2324acc160ac00267da14b21166d89cff92 [revision 738] Author: Loren Merritt Date: Sat Mar 1 04:13:55 2008 +0000 checkasm: relax the threshold for floating-point ssim git-svn-id: svn://svn.videolan.org/x264/trunk@738 df754926-b1dd-0310-bc7b-ec298dee348c commit 68399d5f330bd7307d9b8fd4fc4a63f2c5540009 [revision 737] Author: Loren Merritt Date: Sat Mar 1 04:07:44 2008 +0000 checkasm: test idct with the range of coefficients what can really be encountered, as opposed to random numbers which might overflow. git-svn-id: svn://svn.videolan.org/x264/trunk@737 df754926-b1dd-0310-bc7b-ec298dee348c commit bff0357aaea8eda709e46e9a3f8c38d110ecf8a6 [revision 736] Author: Loren Merritt Date: Mon Jan 28 14:33:42 2008 +0000 intra_rd_refine in B-frames git-svn-id: svn://svn.videolan.org/x264/trunk@736 df754926-b1dd-0310-bc7b-ec298dee348c commit 0289fc10a7522d6d91ae2906f053ad26b775d8d7 [revision 735] Author: Loren Merritt Date: Sun Jan 27 16:29:54 2008 +0000 print average of macroblock QPs instead of frame's nominal QP git-svn-id: svn://svn.videolan.org/x264/trunk@735 df754926-b1dd-0310-bc7b-ec298dee348c commit a783a84024df94041b9c2805bc94c0bc343f2b1a [revision 734] Author: Loren Merritt Date: Sun Jan 27 16:16:37 2008 +0000 update date git-svn-id: svn://svn.videolan.org/x264/trunk@734 df754926-b1dd-0310-bc7b-ec298dee348c commit ecbc00bfb735824419101f8e2186bdae18893c89 [revision 733] Author: Loren Merritt Date: Sun Jan 27 16:06:31 2008 +0000 remove colorspace conversion support, because it has no business in any codec git-svn-id: svn://svn.videolan.org/x264/trunk@733 df754926-b1dd-0310-bc7b-ec298dee348c commit 44f5e6bda9646eea70d57effdb93255721a475a3 [revision 732] Author: Loren Merritt Date: Sun Jan 27 14:01:40 2008 +0000 misc fixes in checkasm git-svn-id: svn://svn.videolan.org/x264/trunk@732 df754926-b1dd-0310-bc7b-ec298dee348c commit 1b59c20789dd95c06074bd521d7b1f15fa139b07 [revision 731] Author: Loren Merritt Date: Sun Jan 27 13:39:09 2008 +0000 remove a useless bit of me=umh (originally copied from JM, where it was used for something) git-svn-id: svn://svn.videolan.org/x264/trunk@731 df754926-b1dd-0310-bc7b-ec298dee348c commit 330e7364c764a18ebba95f26cd638b7992498e1e [revision 730] Author: Loren Merritt Date: Sun Jan 27 11:50:50 2008 +0000 fix a memleak in cqm git-svn-id: svn://svn.videolan.org/x264/trunk@730 df754926-b1dd-0310-bc7b-ec298dee348c commit 9a8218ca82fe1027487651c4fc80c262ff935699 [revision 729] Author: Loren Merritt Date: Sun Jan 27 11:49:16 2008 +0000 fix a memleak in mkv muxer patch by saintdev git-svn-id: svn://svn.videolan.org/x264/trunk@729 df754926-b1dd-0310-bc7b-ec298dee348c commit 8d09ebe2e862688ce213d3f098ce7eca719fea23 [revision 728] Author: Loren Merritt Date: Sun Jan 27 11:36:11 2008 +0000 satd exhaustive motion search (--me tesa) git-svn-id: svn://svn.videolan.org/x264/trunk@728 df754926-b1dd-0310-bc7b-ec298dee348c commit 12c833c863c0e119212977d25723431bc66088a4 [revision 727] Author: Loren Merritt Date: Sun Jan 27 11:09:52 2008 +0000 fix cabac context for nonzero delta_qp of the 2nd mb of a frame in interlaced mode git-svn-id: svn://svn.videolan.org/x264/trunk@727 df754926-b1dd-0310-bc7b-ec298dee348c commit 3f54ed167a5546bd9b8639285090c71a82bbffc4 [revision 726] Author: Loren Merritt Date: Sun Jan 27 10:32:36 2008 +0000 fix mapping of mvs to partitions in p4x4_chroma patch by Noboru Asai git-svn-id: svn://svn.videolan.org/x264/trunk@726 df754926-b1dd-0310-bc7b-ec298dee348c commit fa8df0155128105e59155a38d3dcba8e5e765e16 [revision 725] Author: Loren Merritt Date: Sun Jan 27 10:12:24 2008 +0000 fix mvp for b16x8 and b8x16 L1 search patch by Wei-Yin Chen git-svn-id: svn://svn.videolan.org/x264/trunk@725 df754926-b1dd-0310-bc7b-ec298dee348c commit f98a8e20888847527edcff6a9244de2dc714e42c [revision 724] Author: Loren Merritt Date: Sun Jan 27 10:05:20 2008 +0000 shave a couple cycles off cabac functions git-svn-id: svn://svn.videolan.org/x264/trunk@724 df754926-b1dd-0310-bc7b-ec298dee348c commit 75c77579a5e6386250aa85335408d5d8ed475df4 [revision 723] Author: Loren Merritt Date: Sun Jan 27 09:12:39 2008 +0000 faster and smaller x264_macroblock_cache_mv etc git-svn-id: svn://svn.videolan.org/x264/trunk@723 df754926-b1dd-0310-bc7b-ec298dee348c commit dc95c0f4d30e5289442b1679bf674b92f18f5083 [revision 722] Author: Loren Merritt Date: Sun Jan 27 09:11:01 2008 +0000 configure test for endianness git-svn-id: svn://svn.videolan.org/x264/trunk@722 df754926-b1dd-0310-bc7b-ec298dee348c commit e644e7eaece255a711a9a2eff5e708cd9168bb71 [revision 721] Author: Loren Merritt Date: Fri Jan 18 00:42:38 2008 +0000 change the meaning of --ref: it now selects DPB size (including B-frames), rather than L0 size (which B-frames are added to) git-svn-id: svn://svn.videolan.org/x264/trunk@721 df754926-b1dd-0310-bc7b-ec298dee348c commit 8105162bf7fba8b7d467d5a185db25e4696d74d6 [revision 720] Author: Guillaume Poirier Date: Mon Jan 14 09:54:33 2008 +0000 add / fix support for FreeBSD, based on a patch by Igor Mozolevsky % igor A hybrid-lab P co P uk % git-svn-id: svn://svn.videolan.org/x264/trunk@720 df754926-b1dd-0310-bc7b-ec298dee348c commit 85a9209e7b85e7a38fa3770f963dc9535fe4a19c [revision 719] Author: Loren Merritt Date: Wed Jan 9 11:25:09 2008 +0000 shut up some valgrind warnings git-svn-id: svn://svn.videolan.org/x264/trunk@719 df754926-b1dd-0310-bc7b-ec298dee348c commit ab69c14b91fc4fcdd15b5449fb549c05fe94e9a8 [revision 718] Author: Loren Merritt Date: Tue Jan 8 18:10:51 2008 +0000 slightly wrong memory allocation in r717, fixes a potential crash with merange>32 git-svn-id: svn://svn.videolan.org/x264/trunk@718 df754926-b1dd-0310-bc7b-ec298dee348c commit 2ed861c86065fd556a6f7b18718725c1fc04452b [revision 717] Author: Loren Merritt Date: Sun Jan 6 08:15:04 2008 +0000 convert absolute difference of sums from mmx to sse2 convert mv bits cost and ads threshold from C to sse2 convert bytemask-to-list from C to scalar asm 1.6x faster me=esa (x86_64) or 1.3x faster (x86_32). (times consider only motion estimation. overall encode speedup may vary.) git-svn-id: svn://svn.videolan.org/x264/trunk@717 df754926-b1dd-0310-bc7b-ec298dee348c commit c0fb035a73e93744ad443e0086c810e9bef38232 [revision 716] Author: Loren Merritt Date: Sun Jan 6 08:06:36 2008 +0000 round esa range to a multiple of 4 git-svn-id: svn://svn.videolan.org/x264/trunk@716 df754926-b1dd-0310-bc7b-ec298dee348c commit 644878dc39b22f41c447705c0f70c0b728204068 [revision 715] Author: Guillaume Poirier Date: Thu Jan 3 22:24:38 2008 +0000 use define _WIN32 instead of __WIN32__ or WIN32 defines. NSDN reference: http://msdn2.microsoft.com/en-us/library/b0084kay(VS.80).aspx Patch by BugMaster %BugMaster A narod P ru% Original thread: date: Dec 27, 2007 3:18 AM subject: [x264-devel] VS2008 compilation error (need of replacement __WIN32__ with _WIN32) git-svn-id: svn://svn.videolan.org/x264/trunk@715 df754926-b1dd-0310-bc7b-ec298dee348c commit 04e389d4d9ed2b1ed853822e64bf9c4db71ac841 [revision 714] Author: Loren Merritt Date: Fri Dec 21 01:57:14 2007 +0000 tweak x264_pixel_sad_x4_16x16_sse2 horizontal sum. 168 -> 166 cycles on core2. git-svn-id: svn://svn.videolan.org/x264/trunk@714 df754926-b1dd-0310-bc7b-ec298dee348c commit 665c02975800d1f6d44d2250f869ccfe78405c19 [revision 713] Author: Loren Merritt Date: Thu Dec 20 19:24:17 2007 +0000 fix a nondeterminism involving 8x8dct, rdo, and threads. git-svn-id: svn://svn.videolan.org/x264/trunk@713 df754926-b1dd-0310-bc7b-ec298dee348c commit 55b152e408df09b011b1d60519f890746de64aa1 [revision 712] Author: Guillaume Poirier Date: Thu Dec 13 15:43:41 2007 +0000 also test arch-specific x264_zigzag_* implementations in checkasm.c patch by Patch by Noboru Asai % noboru P asai A gmail P com% git-svn-id: svn://svn.videolan.org/x264/trunk@712 df754926-b1dd-0310-bc7b-ec298dee348c commit e2eb874c1dfd15e62a3801af7e900baebf33746e [revision 711] Author: Guillaume Poirier Date: Mon Dec 10 22:09:13 2007 +0000 Add AltiVec implementation of - x264_zigzag_scan_4x4_frame_altivec() - x264_zigzag_scan_4x4ac_frame_altivec() - x264_zigzag_scan_4x4_field_altivec() - x264_zigzag_scan_4x4ac_field_altivec() each around 1.3 tp 1.8x faster than C version Patch by Noboru Asai % noboru P asai A gmail P com% git-svn-id: svn://svn.videolan.org/x264/trunk@711 df754926-b1dd-0310-bc7b-ec298dee348c commit adba3e534e72eb585982f52ef0c521a26b9fb90a [revision 710] Author: Guillaume Poirier Date: Sun Dec 9 15:50:52 2007 +0000 adds AliVec implementation of predict_16x16_p() over 4x faster than C version git-svn-id: svn://svn.videolan.org/x264/trunk@710 df754926-b1dd-0310-bc7b-ec298dee348c commit e241f028f34c84321213ad6436748e4769e289c5 [revision 709] Author: Loren Merritt Date: Tue Dec 4 21:56:18 2007 +0000 revert the x86_32 part of r708. elf shared libraries aren't important enough to be worth the extra lines of code to check for nasm. git-svn-id: svn://svn.videolan.org/x264/trunk@709 df754926-b1dd-0310-bc7b-ec298dee348c commit 6d185b463056b2cc589aa25c5ed598e63d48dc03 [revision 708] Author: Loren Merritt Date: Mon Dec 3 01:17:23 2007 +0000 mark asm functions as hidden git-svn-id: svn://svn.videolan.org/x264/trunk@708 df754926-b1dd-0310-bc7b-ec298dee348c commit 316150357332e6adcdaeea45950e1f5e94d0b7dc [revision 707] Author: Loren Merritt Date: Mon Dec 3 01:16:57 2007 +0000 check whether ld supports -Bsymbolic before using it git-svn-id: svn://svn.videolan.org/x264/trunk@707 df754926-b1dd-0310-bc7b-ec298dee348c commit 9784fa625ff0ccc6c25e52c18d8858d15584ab44 [revision 706] Author: Loren Merritt Date: Sun Dec 2 15:57:43 2007 +0000 reduce the data type used in some tables. 16KB smaller exe. git-svn-id: svn://svn.videolan.org/x264/trunk@706 df754926-b1dd-0310-bc7b-ec298dee348c commit e93dcb07eb6937550dee8cf4cd324a99fa7dae2c [revision 705] Author: Loren Merritt Date: Sat Dec 1 18:03:16 2007 +0000 faster removal of duplicate mv predictors git-svn-id: svn://svn.videolan.org/x264/trunk@705 df754926-b1dd-0310-bc7b-ec298dee348c commit 5d49ebd28eab0597fe74ce4d6fdffe768b145442 [revision 704] Author: Loren Merritt Date: Sat Dec 1 15:17:19 2007 +0000 avoid a division in x264_mb_predict_mv_ref16x16. patch by Fiona Glaser. git-svn-id: svn://svn.videolan.org/x264/trunk@704 df754926-b1dd-0310-bc7b-ec298dee348c commit 98f69c4af8834fd7c05f4e3fe9154288610fd912 [revision 703] Author: Loren Merritt Date: Sat Dec 1 02:58:34 2007 +0000 avoid a division in umh. patch by Fiona Glaser. git-svn-id: svn://svn.videolan.org/x264/trunk@703 df754926-b1dd-0310-bc7b-ec298dee348c commit 0642e139bffbf180f293ea4a94e2576c6963f5dd [revision 702] Author: Loren Merritt Date: Mon Nov 26 11:44:37 2007 +0000 fix a memleak in h->mb.mvr git-svn-id: svn://svn.videolan.org/x264/trunk@702 df754926-b1dd-0310-bc7b-ec298dee348c commit 290de9638e5364c37316010ac648a6c959f6dd26 [revision 701] Author: Loren Merritt Date: Sun Nov 25 12:38:19 2007 +0000 fix compilation as a shared library on x86_64 (regression in r696) git-svn-id: svn://svn.videolan.org/x264/trunk@701 df754926-b1dd-0310-bc7b-ec298dee348c commit e5479d7c58110a219b875889449bb3dc027735ea [revision 700] Author: Guillaume Poirier Date: Wed Nov 21 18:30:49 2007 +0000 add support for x86_64 on Darwin9.0 (Mac OS X 10.5, aka Leopard) Patch by Antoine Gerschenfeld %gerschen A clipper P ens P fr% git-svn-id: svn://svn.videolan.org/x264/trunk@700 df754926-b1dd-0310-bc7b-ec298dee348c commit bc77b5b91801325b0e36ac300535811658104690 [revision 699] Author: Loren Merritt Date: Wed Nov 21 11:52:19 2007 +0000 cover some more options in fprofile. (esa, bime, cqm, nr, no-dct-decimate, trellis2) previously, esa was slower with fprofile than without, since gcc thought it wasn't important. now esa benefits like anything else. git-svn-id: svn://svn.videolan.org/x264/trunk@699 df754926-b1dd-0310-bc7b-ec298dee348c commit 0291a044fa4334173c3129d80bf0e893a6d143d7 [revision 698] Author: Guillaume Poirier Date: Tue Nov 20 18:22:03 2007 +0000 Add AltiVec implementation of x264_pixel_ssd_8x8, 3x faster than C version Overall speed-up: 0.7% with --bframes 3 --ref 5 -m 7 --b-rdo Patch by Noboru Asai %noboru P asai A gmail P com% git-svn-id: svn://svn.videolan.org/x264/trunk@698 df754926-b1dd-0310-bc7b-ec298dee348c commit 9abaaaece8e7a4495d2b22bd47c39c3d81172610 [revision 697] Author: Loren Merritt Date: Tue Nov 20 08:53:26 2007 +0000 limit mvs to [-512,511.75] instead of [-512,512] git-svn-id: svn://svn.videolan.org/x264/trunk@697 df754926-b1dd-0310-bc7b-ec298dee348c commit d4ebafa5d7db55e0d21c633c25d4835c5b94e3fd [revision 696] Author: Loren Merritt Date: Tue Nov 20 06:07:17 2007 +0000 avoid memory loads that span the border between two cachelines. on core2 this makes x264_pixel_sad an average of 2x faster. other intel cpus gain various amounts. amd are unaffected. overall speedup: 1-10%, depending on how much time is spent in fullpel motion estimation. git-svn-id: svn://svn.videolan.org/x264/trunk@696 df754926-b1dd-0310-bc7b-ec298dee348c commit 125e0a84e04d04ac2dde69e091a75295f35120bc [revision 695] Author: Loren Merritt Date: Tue Nov 20 05:57:29 2007 +0000 add cache info to cpu_detect. also print sse3. git-svn-id: svn://svn.videolan.org/x264/trunk@695 df754926-b1dd-0310-bc7b-ec298dee348c commit b5d6d038ce4c643c6e6b096890728aba8c8db35c [revision 694] Author: Loren Merritt Date: Mon Nov 19 17:10:57 2007 +0000 cosmetics: reorder mc_luma/mc_chroma/get_ref arguments for consistency with other functions git-svn-id: svn://svn.videolan.org/x264/trunk@694 df754926-b1dd-0310-bc7b-ec298dee348c commit 94c55d8cc75ea88987ec6766d83ea0dd0aa7384f [revision 693] Author: Loren Merritt Date: Mon Nov 19 17:08:07 2007 +0000 separate pixel_avg into cases for mc and for bipred git-svn-id: svn://svn.videolan.org/x264/trunk@693 df754926-b1dd-0310-bc7b-ec298dee348c commit fcbd7e0c80d30dc7f0b098e704038e015f536be5 [revision 692] Author: Guillaume Poirier Date: Sun Nov 18 23:58:18 2007 +0000 add AltiVec implementation of ssim_4x4x2_core, about 4x faster than C version. Overall: 0.1-0.2% faster with default encoding settings Patch by Noboru Asai %noboru P asai A gmail P com% git-svn-id: svn://svn.videolan.org/x264/trunk@692 df754926-b1dd-0310-bc7b-ec298dee348c commit c0cd142e71575907c570ce6606e38b726e36d10e [revision 691] Author: Guillaume Poirier Date: Sun Nov 18 23:47:41 2007 +0000 Add AltiVec implementation ofx264_hpel_filter. Provides a 10-11% overall speed-up with default encoding options Patch by Noboru Asai %noboru P asai A gmail P com% git-svn-id: svn://svn.videolan.org/x264/trunk@691 df754926-b1dd-0310-bc7b-ec298dee348c commit a6e82dfff168a77d7801c08e61918b3e91db84cc [revision 690] Author: Loren Merritt Date: Sun Nov 18 01:45:44 2007 +0000 cosmetics in dsp function selection git-svn-id: svn://svn.videolan.org/x264/trunk@690 df754926-b1dd-0310-bc7b-ec298dee348c commit 2cd3c7b140e7bc60d902e4e7a049c17e82c02f6e [revision 689] Author: Loren Merritt Date: Sat Nov 17 10:21:46 2007 +0000 remove sad_pde. it's been unused ever since successive elimination replaced it. git-svn-id: svn://svn.videolan.org/x264/trunk@689 df754926-b1dd-0310-bc7b-ec298dee348c commit d9891656c3c376eb84fec3f85ebdee385d4de1f2 [revision 688] Author: Loren Merritt Date: Fri Nov 16 10:27:14 2007 +0000 cosmetics: use symbolic constants for frame padding radius git-svn-id: svn://svn.videolan.org/x264/trunk@688 df754926-b1dd-0310-bc7b-ec298dee348c commit 7b8c02d57b9b73a57ea7e58dd74b072197def2c7 [revision 687] Author: Loren Merritt Date: Fri Nov 16 09:17:58 2007 +0000 move hpel_filter cpu detection to a function pointer like everything else git-svn-id: svn://svn.videolan.org/x264/trunk@687 df754926-b1dd-0310-bc7b-ec298dee348c commit a79aac149ff148a6e32c7a305bac224490866a8f [revision 686] Author: Loren Merritt Date: Thu Nov 15 10:50:37 2007 +0000 cosmetics: use separate variables for frame width and stride git-svn-id: svn://svn.videolan.org/x264/trunk@686 df754926-b1dd-0310-bc7b-ec298dee348c commit a8650641e1006d9750cc97d3c1672871c4549296 [revision 685] Author: Guillaume Poirier Date: Mon Nov 12 20:36:33 2007 +0000 Add AltiVec implementation of add4x4_idct, add8x8_idct, add16x16_idct, 3.2x faster on average 1.05x faster overall with default encoding options Patch by Noboru Asai % noboru DD asai AA gmail DD com % git-svn-id: svn://svn.videolan.org/x264/trunk@685 df754926-b1dd-0310-bc7b-ec298dee348c commit 3b6b4c412037f072d0511cf48524987f3b927428 [revision 684] Author: Guillaume Poirier Date: Mon Nov 12 20:28:30 2007 +0000 add AltiVec implementation of dequant_4x4 and dequant_8x8, 2.8x faster than C, 1.01x faster than previous revision with default encoding options Patch by Noboru Asai % noboru DD asai AA gmail DD com % git-svn-id: svn://svn.videolan.org/x264/trunk@684 df754926-b1dd-0310-bc7b-ec298dee348c commit 09334c1a26d8b5485f12c233242d0aaf91003aea [revision 683] Author: Guillaume Poirier Date: Mon Nov 12 12:47:38 2007 +0000 Add AltiVec implementation of quant_2x2_dc, fix Altivec implementation of quant_(4x4|8x8)(|_dc) wrt current C implementation Patch by Noboru Asai % noboru DD asai AA gmail DD com % git-svn-id: svn://svn.videolan.org/x264/trunk@683 df754926-b1dd-0310-bc7b-ec298dee348c commit 57461cb4cec75eff41f1ac5b0f136511c5ccad28 [revision 682] Author: Loren Merritt Date: Thu Nov 1 12:21:13 2007 +0000 fix a possible nondeterminism with me=umh + threads. git-svn-id: svn://svn.videolan.org/x264/trunk@682 df754926-b1dd-0310-bc7b-ec298dee348c commit 22455694153d43a9f85837db6eee641ebc4dcdb6 [revision 681] Author: Loren Merritt Date: Mon Oct 29 14:48:46 2007 +0000 use hex instead of dia for rdo mv refinement. ~0.5% lower bitrate at subme=7. patch by Fiona Glaser. git-svn-id: svn://svn.videolan.org/x264/trunk@681 df754926-b1dd-0310-bc7b-ec298dee348c commit 35094bec4e0202cbdb710b98fa04ea24375531e0 [revision 680] Author: Loren Merritt Date: Mon Sep 24 13:37:44 2007 +0000 port sad_*_x3_sse2 to x86_64 git-svn-id: svn://svn.videolan.org/x264/trunk@680 df754926-b1dd-0310-bc7b-ec298dee348c commit 673ce32a59310b5494049cce140e5420128331ed [revision 679] Author: Loren Merritt Date: Mon Sep 24 11:24:28 2007 +0000 don't overwrite pthread* namespace, because system headers might define those functions even if we don't want them git-svn-id: svn://svn.videolan.org/x264/trunk@679 df754926-b1dd-0310-bc7b-ec298dee348c commit 5c49545c0277a8da1c29809118a734c461116c54 [revision 678] Author: Loren Merritt Date: Fri Sep 21 20:20:22 2007 +0000 faster 4x4 sad git-svn-id: svn://svn.videolan.org/x264/trunk@678 df754926-b1dd-0310-bc7b-ec298dee348c commit a6edfd669f97154571f203cfa69d634a639800ff [revision 677] Author: Loren Merritt Date: Thu Sep 20 08:10:45 2007 +0000 fix an arithmetic overflow in trellis at high qp. git-svn-id: svn://svn.videolan.org/x264/trunk@677 df754926-b1dd-0310-bc7b-ec298dee348c commit 463437926e73b3d18542804cbef31f277d115cc2 [revision 676] Author: Loren Merritt Date: Sat Sep 15 06:34:05 2007 +0000 implement multithreaded me=esa git-svn-id: svn://svn.videolan.org/x264/trunk@676 df754926-b1dd-0310-bc7b-ec298dee348c commit cde5f334121ba1cf6ae13174337ae49008c1f2a4 [revision 675] Author: Loren Merritt Date: Wed Sep 12 05:42:23 2007 +0000 fix some integer overflows. now vbv size can exceed 2 Gbit. git-svn-id: svn://svn.videolan.org/x264/trunk@675 df754926-b1dd-0310-bc7b-ec298dee348c commit d16a4da48b06671e85578ee022729bb2fb6f59c9 [revision 674] Author: Loren Merritt Date: Sun Sep 9 03:17:20 2007 +0000 allow --vbv-init to take absolute values (in kbit), in addition to the previous fractions of vbv-bufsize. git-svn-id: svn://svn.videolan.org/x264/trunk@674 df754926-b1dd-0310-bc7b-ec298dee348c commit 98494077449c4a66ed55cd5ee5a89b7c62e12dd0 [revision 673] Author: Loren Merritt Date: Fri Sep 7 20:40:13 2007 +0000 remove a bashism git-svn-id: svn://svn.videolan.org/x264/trunk@673 df754926-b1dd-0310-bc7b-ec298dee348c commit 759620535bccdb4d872fcc3b798eb6df68c672db [revision 672] Author: Loren Merritt Date: Sun Sep 2 04:32:17 2007 +0000 reorder headers so that largefile support is defined before the first copy of stdio git-svn-id: svn://svn.videolan.org/x264/trunk@672 df754926-b1dd-0310-bc7b-ec298dee348c commit 71af28517442b4388c031d1ce74fa5096beca9bf [revision 671] Author: Loren Merritt Date: Mon Aug 20 16:44:42 2007 +0000 regression in r669: broke saving of configure args if make has to re-run configure git-svn-id: svn://svn.videolan.org/x264/trunk@671 df754926-b1dd-0310-bc7b-ec298dee348c commit 393daac2d7578e0a9d7d541f25026f22f62d03bc [revision 670] Author: Loren Merritt Date: Sat Aug 18 01:13:22 2007 +0000 regression in r669: --enable-shared should imply --enable-pic on some archs. git-svn-id: svn://svn.videolan.org/x264/trunk@670 df754926-b1dd-0310-bc7b-ec298dee348c commit 113800851647deafdacc9a35ee8a1a761f1c777d [revision 669] Author: Loïc Minier Date: Sun Aug 12 12:46:15 2007 +0000 * Add a --host flag to allow overriding config.guess; this is particularly useful with a 64-bits kernel running a 32-bits userland to build 32-bits apps. * Normalize any host triplet into a quadruplet via config.sub. * Move option parsing before any use of architecture information. git-svn-id: svn://svn.videolan.org/x264/trunk@669 df754926-b1dd-0310-bc7b-ec298dee348c commit 3d51a5ba7966542355f0e3904f6e8319b9ba55c2 [revision 668] Author: Loïc Minier Date: Sun Aug 12 12:36:23 2007 +0000 * Update config.guess. git-svn-id: svn://svn.videolan.org/x264/trunk@668 df754926-b1dd-0310-bc7b-ec298dee348c commit 0ea7d1b2ecb6d744c0af674e7b3b8f85eabf3aaa [revision 667] Author: Loren Merritt Date: Tue Jul 17 11:24:26 2007 +0000 mingw doesn't have strtok_r git-svn-id: svn://svn.videolan.org/x264/trunk@667 df754926-b1dd-0310-bc7b-ec298dee348c commit 85f2fc3252f0fae8031fa9e942577d121cb75cd1 [revision 666] Author: Loren Merritt Date: Tue Jul 17 11:11:19 2007 +0000 move os/compiler specific defines to their own header git-svn-id: svn://svn.videolan.org/x264/trunk@666 df754926-b1dd-0310-bc7b-ec298dee348c commit a18f3dab89e6786233c11723900f8b0126e1494d [revision 665] Author: Loren Merritt Date: Thu Jul 12 23:48:23 2007 +0000 extend zones to support (some) encoding parameters in addition to ratecontrol. git-svn-id: svn://svn.videolan.org/x264/trunk@665 df754926-b1dd-0310-bc7b-ec298dee348c commit d5ddf40b1ae1c267ea3f9e998d6ac1e3a4004e07 [revision 664] Author: Loren Merritt Date: Fri Jul 6 17:08:26 2007 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@664 df754926-b1dd-0310-bc7b-ec298dee348c commit ee62378f91c3cfcc028e771b8bc998b4490cc8a1 [revision 663] Author: Loren Merritt Date: Thu Jun 28 21:26:21 2007 +0000 limit vertical motion vectors to +/-512, since some decoders actually depend on that limit. git-svn-id: svn://svn.videolan.org/x264/trunk@663 df754926-b1dd-0310-bc7b-ec298dee348c commit 303175413b0f1f38d488c7decbce9d7ccb81f647 [revision 662] Author: Guillaume Poirier Date: Fri Jun 22 21:42:41 2007 +0000 Add vertical and horizontal luma deblocking accelerated with Altivec, based on Graham Booker's code written for FFmpeg with slight modifications to re-use x264's macros git-svn-id: svn://svn.videolan.org/x264/trunk@662 df754926-b1dd-0310-bc7b-ec298dee348c commit 9acd233b21bb1c0808b2f4c4c100511b3d6d45a1 [revision 661] Author: Loren Merritt Date: Sat Jun 16 04:09:01 2007 +0000 cosmetics in cpu detection git-svn-id: svn://svn.videolan.org/x264/trunk@661 df754926-b1dd-0310-bc7b-ec298dee348c commit 80090ccfb3493659451c21506d77dcacda6bcab2 [revision 660] Author: Loren Merritt Date: Sat Jun 16 04:02:48 2007 +0000 fix compilation without asm on x86_32 (r658 worked only on x86_64). git-svn-id: svn://svn.videolan.org/x264/trunk@660 df754926-b1dd-0310-bc7b-ec298dee348c commit fedfacea656db5a327fba0c449fe70181539f876 [revision 659] Author: Loren Merritt Date: Sun Jun 10 23:46:31 2007 +0000 exempt 1080p from the non-mod16 warning. git-svn-id: svn://svn.videolan.org/x264/trunk@659 df754926-b1dd-0310-bc7b-ec298dee348c commit 17dd119d8275224ad7f830dcdf4a5458150fcd31 [revision 658] Author: Alex Izvorski Date: Tue Jun 5 18:38:31 2007 +0000 allow compiling without yasm/nasm on x86 and x86-64 platforms git-svn-id: svn://svn.videolan.org/x264/trunk@658 df754926-b1dd-0310-bc7b-ec298dee348c commit 26bed72be4941353b4c79eddf500ce39aaf75490 [revision 657] Author: Alex Izvorski Date: Tue Jun 5 18:32:13 2007 +0000 updated MS VC8/VC7 build, patch by Gabriel Bouvigne git-svn-id: svn://svn.videolan.org/x264/trunk@657 df754926-b1dd-0310-bc7b-ec298dee348c commit a35548661f5f50c22516147b9b164f25d44a69db [revision 656] Author: Alex Izvorski Date: Sat May 26 03:13:08 2007 +0000 replace alloca with malloc everywhere. per manpage, use of alloca is discouraged. this may have a minor effect on the speed of ssim and esa, but that appears too small to measure. git-svn-id: svn://svn.videolan.org/x264/trunk@656 df754926-b1dd-0310-bc7b-ec298dee348c commit ffa4e76d4573303271c768f6ec03c21f0f2f4f02 [revision 655] Author: Loren Merritt Date: Wed May 2 21:33:43 2007 +0000 require a ratecontrol method to be specified, it no longer defaults to cqp=26. git-svn-id: svn://svn.videolan.org/x264/trunk@655 df754926-b1dd-0310-bc7b-ec298dee348c commit b5ef788a7c6d2c7ba2cdb664ffd5544603979495 [revision 654] Author: Loren Merritt Date: Mon Apr 23 08:34:52 2007 +0000 fix nnz computation in cavlc+8x8dct+deblock. (regression in r607) git-svn-id: svn://svn.videolan.org/x264/trunk@654 df754926-b1dd-0310-bc7b-ec298dee348c commit 08b4f6956135d54693b4e618f58fd7f68e654473 [revision 653] Author: Loren Merritt Date: Mon Apr 23 07:09:30 2007 +0000 fix the computation of bits used for vbv. (regression in r651) git-svn-id: svn://svn.videolan.org/x264/trunk@653 df754926-b1dd-0310-bc7b-ec298dee348c commit fe85aca1c58ef4975aee8e6e32b4d57624960367 [revision 652] Author: Loren Merritt Date: Sun Apr 22 03:37:56 2007 +0000 c89 compile fix git-svn-id: svn://svn.videolan.org/x264/trunk@652 df754926-b1dd-0310-bc7b-ec298dee348c commit b3076aef6c40f10260ef7386e3f2e028997da5d5 [revision 651] Author: Loren Merritt Date: Sat Apr 21 11:32:34 2007 +0000 cabac: use bytestream instead of bitstream. 35% faster cabac, 20% faster overall lossless, ~1% faster overall at normal bitrates. git-svn-id: svn://svn.videolan.org/x264/trunk@651 df754926-b1dd-0310-bc7b-ec298dee348c commit 8300d3344b73612ec449a2c2ee259c654fef9d0a [revision 650] Author: Loren Merritt Date: Wed Apr 11 22:21:15 2007 +0000 remove the restriction on number of threads as a function of resolution (it was wrong anyway in the presence of B-frames), and raise the max number of threads in general (though more will have to be done before it can really scale to lots of cores). git-svn-id: svn://svn.videolan.org/x264/trunk@650 df754926-b1dd-0310-bc7b-ec298dee348c commit 8ecf4b912011e9a097d570086553f4685d5dddc5 [revision 649] Author: Loren Merritt Date: Tue Apr 10 22:37:18 2007 +0000 tweak ssse3 quant git-svn-id: svn://svn.videolan.org/x264/trunk@649 df754926-b1dd-0310-bc7b-ec298dee348c commit c266480eb24e7886877f52436056e99fced02cdc [revision 648] Author: Loren Merritt Date: Sat Apr 7 04:53:16 2007 +0000 change some tables from int to int8_t. 13KB smaller executable. git-svn-id: svn://svn.videolan.org/x264/trunk@648 df754926-b1dd-0310-bc7b-ec298dee348c commit 2f66c11a4eeb17950b3aee18cc105572e860ec44 [revision 647] Author: Loren Merritt Date: Fri Apr 6 21:45:33 2007 +0000 faster cabac rdo. up to 10% faster at q0, but negligible at normal bitrates. git-svn-id: svn://svn.videolan.org/x264/trunk@647 df754926-b1dd-0310-bc7b-ec298dee348c commit 3e7b136c8525f73f6e01be260adbfc15c34503d7 [revision 646] Author: Loren Merritt Date: Fri Apr 6 21:17:34 2007 +0000 workaround gcc's inability to align variables on the stack. this crash was introduced in r642, but only because previous versions didn't use sse2 on the stack. git-svn-id: svn://svn.videolan.org/x264/trunk@646 df754926-b1dd-0310-bc7b-ec298dee348c commit 84676d2eba9fc18d62e168b60d7d1118d1c232d3 [revision 645] Author: Loren Merritt Date: Thu Apr 5 16:11:03 2007 +0000 32bit version of ssse3 satd. switch default assembler to yasm. it will still fallback to nasm if you don't have yasm. git-svn-id: svn://svn.videolan.org/x264/trunk@645 df754926-b1dd-0310-bc7b-ec298dee348c commit 71c097b28e1405076f554d3b948885fd69c1774f [revision 644] Author: Loren Merritt Date: Wed Apr 4 19:34:02 2007 +0000 simplify trellis git-svn-id: svn://svn.videolan.org/x264/trunk@644 df754926-b1dd-0310-bc7b-ec298dee348c commit 12681eea55239d21eb8ce0ca058c5d859afedc9f [revision 643] Author: Loren Merritt Date: Wed Apr 4 18:59:20 2007 +0000 fix an arithmetic overflow in trellis with QP >= 42 git-svn-id: svn://svn.videolan.org/x264/trunk@643 df754926-b1dd-0310-bc7b-ec298dee348c commit 10265a0c2a0b29e6252ad3be6fad1569e7a04339 [revision 642] Author: Loren Merritt Date: Wed Apr 4 18:45:25 2007 +0000 2x faster quant. 2% overall. side effects: not bit-identical to the previous algorithm. while the new algorithm covers a wider range of cqms than the previous one did, I couldn't find a good way to fallback to a general version for the extreme cqms. so now it refuses to encode extreme cqms instead of just being slower. lays a framework for custom deadzone matrices, though I didn't add an api. git-svn-id: svn://svn.videolan.org/x264/trunk@642 df754926-b1dd-0310-bc7b-ec298dee348c commit b37ac36e8442dd8fd8f25933b6a0a119d471e8f7 [revision 641] Author: Loren Merritt Date: Wed Apr 4 18:35:51 2007 +0000 when encoding with a cqm, probe_skip now also uses the cqm, instead of the flat matrix git-svn-id: svn://svn.videolan.org/x264/trunk@641 df754926-b1dd-0310-bc7b-ec298dee348c commit e3a07e098f96dfc2dbde8da6cad77ed012d4397e [revision 640] Author: Loren Merritt Date: Wed Apr 4 00:48:55 2007 +0000 cosmetics in asm macros git-svn-id: svn://svn.videolan.org/x264/trunk@640 df754926-b1dd-0310-bc7b-ec298dee348c commit 71943e8acbfdb85d944f2800e49bcb7902afaaf3 [revision 639] Author: Alex Izvorski Date: Tue Apr 3 17:18:17 2007 +0000 use only c-style comments in public header (patch by Vincent Torres) git-svn-id: svn://svn.videolan.org/x264/trunk@639 df754926-b1dd-0310-bc7b-ec298dee348c commit dfb854775c7b52945a84ef756dc88a4ccb7c2d2c [revision 638] Author: Loren Merritt Date: Mon Apr 2 23:56:09 2007 +0000 in hpel search, merge two 16x16 mc calls into one 16x17. 15% faster hpel, .3% overall. git-svn-id: svn://svn.videolan.org/x264/trunk@638 df754926-b1dd-0310-bc7b-ec298dee348c commit e63c3924ef2c9790aa6440fa11dddf1026862f23 [revision 637] Author: Christophe Mutricy Date: Mon Apr 2 19:17:28 2007 +0000 Compile fix git-svn-id: svn://svn.videolan.org/x264/trunk@637 df754926-b1dd-0310-bc7b-ec298dee348c commit dd7e21c6cba26298eac190abcb702b490bd98c5e [revision 636] Author: Loren Merritt Date: Fri Mar 30 20:20:36 2007 +0000 remove private stuff from public headers. no more need for -D__X264__ git-svn-id: svn://svn.videolan.org/x264/trunk@636 df754926-b1dd-0310-bc7b-ec298dee348c commit 87fdea89e007258c2d48156f14b883ef29c02831 [revision 635] Author: Loren Merritt Date: Sat Mar 24 12:58:27 2007 +0000 adjust bitstream buffer sizes for very large frames git-svn-id: svn://svn.videolan.org/x264/trunk@635 df754926-b1dd-0310-bc7b-ec298dee348c commit 8b37cc6aa5e23bf4a529b79745f73dedab1fa4ff [revision 634] Author: Loren Merritt Date: Wed Mar 14 22:41:50 2007 +0000 conflate HAVE_MMXEXT with HAVE_SSE2, since they were never used distinctly. git-svn-id: svn://svn.videolan.org/x264/trunk@634 df754926-b1dd-0310-bc7b-ec298dee348c commit 11ef32f432b8e055c30c99531e25320dbce8f656 [revision 633] Author: Sam Hocevar Date: Wed Mar 14 21:53:47 2007 +0000 * Made -DNEED_ALTIVEC unnecessary, thanks to Guillaume Poirier. git-svn-id: svn://svn.videolan.org/x264/trunk@633 df754926-b1dd-0310-bc7b-ec298dee348c commit 37ad2377b8781330d400985ce12a1ec61067a5e4 [revision 632] Author: Sam Hocevar Date: Wed Mar 14 21:31:50 2007 +0000 * check x264_cpu_detect() before calling AltiVec functions. git-svn-id: svn://svn.videolan.org/x264/trunk@632 df754926-b1dd-0310-bc7b-ec298dee348c commit 8aef0e941d986f10427cc2d3a848162065bdef3a [revision 631] Author: Loren Merritt Date: Wed Mar 14 21:11:11 2007 +0000 ssse3 detection. x86_64 ssse3 satd and quant. requires yasm >= 0.6.0 git-svn-id: svn://svn.videolan.org/x264/trunk@631 df754926-b1dd-0310-bc7b-ec298dee348c commit 1980de9bba111561be5ad3dde37b6f7a29a80a4e [revision 630] Author: Sam Hocevar Date: Wed Mar 14 20:40:24 2007 +0000 * Use -maltivec when building dependencies, or cannot be used. * Do not declare vectors in non-AltiVec files. git-svn-id: svn://svn.videolan.org/x264/trunk@630 df754926-b1dd-0310-bc7b-ec298dee348c commit e1a4ae9ef58f461aa8ca1e0a1f88140a61d03680 [revision 629] Author: Sam Hocevar Date: Wed Mar 14 18:04:06 2007 +0000 * common/cpu.c: runtime AltiVec autodetection on Linux. * configure, Makefile: do not build the whole project with -maltivec because it generates AltiVec code in weird places. git-svn-id: svn://svn.videolan.org/x264/trunk@629 df754926-b1dd-0310-bc7b-ec298dee348c commit f81c3eafa27ad90a02c1d87a9f74b509199ddb63 [revision 628] Author: Loren Merritt Date: Mon Mar 5 15:35:42 2007 +0000 fix a small memleak. patch by Limin Wang. git-svn-id: svn://svn.videolan.org/x264/trunk@628 df754926-b1dd-0310-bc7b-ec298dee348c commit 62fc8a9c507166baaf104e69fd4c055d41fb0dce [revision 627] Author: Guillaume Poirier Date: Sat Mar 3 12:59:23 2007 +0000 compile fix for GCC-3.3 on OSX, based on a patch by Patrice Bensoussan % patrice P bensoussan A free P fr% Note: regression test still do not pass with GCC-3.3, but they never did as far as I can remember. git-svn-id: svn://svn.videolan.org/x264/trunk@627 df754926-b1dd-0310-bc7b-ec298dee348c commit 89930744ed6921c5dfb0ffdfad2ca6059b2e6c6a [revision 626] Author: Loren Merritt Date: Sat Mar 3 12:12:54 2007 +0000 cosmetics in regression test git-svn-id: svn://svn.videolan.org/x264/trunk@626 df754926-b1dd-0310-bc7b-ec298dee348c commit 6e1bbd2ff961ee51d8a5bd566784f19fb194c028 [revision 625] Author: Alex Izvorski Date: Sat Mar 3 11:44:01 2007 +0000 regression testing, run similar to fprofiled: VIDS='vid_720x480.yuv' make test git-svn-id: svn://svn.videolan.org/x264/trunk@625 df754926-b1dd-0310-bc7b-ec298dee348c commit 435b675bac7e89d34644affd69bf88a82ddc9242 [revision 624] Author: Alex Izvorski Date: Wed Feb 28 18:47:04 2007 +0000 add ability to generate doxygen documentation; make dox git-svn-id: svn://svn.videolan.org/x264/trunk@624 df754926-b1dd-0310-bc7b-ec298dee348c commit dac2be0cc94b897749381135e248cd1844f58fda [revision 623] Author: Loren Merritt Date: Thu Feb 22 05:01:38 2007 +0000 oops, scenecut detection failed to activate when using threads and not using B-frames git-svn-id: svn://svn.videolan.org/x264/trunk@623 df754926-b1dd-0310-bc7b-ec298dee348c commit 49d16d48a132b3de8b80a59752e95dbe896e7477 [revision 622] Author: Loren Merritt Date: Mon Jan 29 14:42:42 2007 +0000 extras/getopt.c was BSD licensed. replace with a LGPL version (from glibc). git-svn-id: svn://svn.videolan.org/x264/trunk@622 df754926-b1dd-0310-bc7b-ec298dee348c commit 972172560df8efb3d2162938526b3a3c812710a4 [revision 621] Author: Guillaume Poirier Date: Thu Jan 25 08:32:16 2007 +0000 Fix build issues on Linux. Only gcc-4.x is supported, as on OSX. Cleans up a few inconsistencies in the code too. git-svn-id: svn://svn.videolan.org/x264/trunk@621 df754926-b1dd-0310-bc7b-ec298dee348c commit c9cd0fce3fcdb83fee9a49987abaa9983b4d1cf4 [revision 620] Author: Loren Merritt Date: Sun Jan 21 12:12:04 2007 +0000 tweak block_residual_write_cavlc. up to 1% faster lossless, no difference at normal bitrates. git-svn-id: svn://svn.videolan.org/x264/trunk@620 df754926-b1dd-0310-bc7b-ec298dee348c commit 4c949c31a111c461b0362409baaff4550b7f664f [revision 619] Author: Loren Merritt Date: Sat Jan 20 05:07:44 2007 +0000 don't assume int is exactly 4 bytes git-svn-id: svn://svn.videolan.org/x264/trunk@619 df754926-b1dd-0310-bc7b-ec298dee348c commit cf9e6c5c0ea4771da523d1a95517b2276cca6cf1 [revision 618] Author: Loren Merritt Date: Thu Jan 11 23:55:51 2007 +0000 make array_non_zero() compatible with -fstrict-aliasing git-svn-id: svn://svn.videolan.org/x264/trunk@618 df754926-b1dd-0310-bc7b-ec298dee348c commit 285f98e197be56efde6b2c42832625193c432c54 [revision 617] Author: Christophe Mutricy Date: Tue Jan 9 20:25:32 2007 +0000 Honor CFLAGS and LDFLAGS set by the user git-svn-id: svn://svn.videolan.org/x264/trunk@617 df754926-b1dd-0310-bc7b-ec298dee348c commit 0fe97f423591939454ac7d272fe5fb5dde837b3f [revision 616] Author: Eric Petit Date: Tue Jan 2 14:51:10 2007 +0000 Check whether 'echo -n' works, otherwise try printf (fixes build on current OS X 10.5) git-svn-id: svn://svn.videolan.org/x264/trunk@616 df754926-b1dd-0310-bc7b-ec298dee348c commit bbc68bea7c30988d9f59e2ac99f88f9188f65654 [revision 615] Author: Eric Petit Date: Mon Jan 1 22:41:44 2007 +0000 Check version of nasm on OS X / Intel git-svn-id: svn://svn.videolan.org/x264/trunk@615 df754926-b1dd-0310-bc7b-ec298dee348c commit b630af6c4af3118453d91aae75d2e61c513f2010 [revision 614] Author: Loren Merritt Date: Wed Dec 20 04:22:59 2006 +0000 wrong reference frames were used with refs>=14 + pyramid (regression in r607) git-svn-id: svn://svn.videolan.org/x264/trunk@614 df754926-b1dd-0310-bc7b-ec298dee348c commit 01e1db245d0c1c6119a8316994f2083c6096811a [revision 613] Author: Loren Merritt Date: Tue Dec 19 21:24:47 2006 +0000 enable thread synchronization primitives on linux too git-svn-id: svn://svn.videolan.org/x264/trunk@613 df754926-b1dd-0310-bc7b-ec298dee348c commit 34c6fb35ac799907caed6ad1cbbc287136c6b6c6 [revision 612] Author: Loren Merritt Date: Tue Dec 19 09:35:45 2006 +0000 fix a crash with x264_encoder_headers() + threads git-svn-id: svn://svn.videolan.org/x264/trunk@612 df754926-b1dd-0310-bc7b-ec298dee348c commit dfe7bb017bface15c9ed47ab5924062300a79e5c [revision 611] Author: Loren Merritt Date: Sat Dec 16 00:46:37 2006 +0000 don't skip autodection on configure --enable-pthread git-svn-id: svn://svn.videolan.org/x264/trunk@611 df754926-b1dd-0310-bc7b-ec298dee348c commit ab3b6602d733c9382a17b9e20a4478efd7ae5994 [revision 610] Author: Loren Merritt Date: Sat Dec 16 00:32:38 2006 +0000 more win32threads -> pthreads git-svn-id: svn://svn.videolan.org/x264/trunk@610 df754926-b1dd-0310-bc7b-ec298dee348c commit cc753d6ba6634cfeee9c8b461bc996adf7e1aee6 [revision 609] Author: Loren Merritt Date: Fri Dec 15 23:08:57 2006 +0000 cosmetics: rename list operators to be consistent with Perl, and move them to common/ git-svn-id: svn://svn.videolan.org/x264/trunk@609 df754926-b1dd-0310-bc7b-ec298dee348c commit 87f1430384a1c61035a4ca595e5defecf4b64cb8 [revision 608] Author: Loren Merritt Date: Fri Dec 15 23:06:21 2006 +0000 win32: use pthreads instead of win32threads. for some reason, pthreads is much faster. git-svn-id: svn://svn.videolan.org/x264/trunk@608 df754926-b1dd-0310-bc7b-ec298dee348c commit 7b4f6a1fd95c7e0ab479e116fe59e66e5d1fd107 [revision 607] Author: Loren Merritt Date: Fri Dec 15 23:03:36 2006 +0000 New threading method: Encode multiple frames in prallel instead of dividing each frame into slices. Improves speed, and reduces the bitrate penalty of threading. Side effects: It is no longer possible to re-encode a frame, so threaded scenecut detection must run in the pre-me pass, which is faster but less precise. It is now useful to use more threads than you have cpus. --threads=auto has been updated to use cpus*1.5. Minor changes to ratecontrol. New options: --pre-scenecut, --mvrange-thread, --non-deterministic git-svn-id: svn://svn.videolan.org/x264/trunk@607 df754926-b1dd-0310-bc7b-ec298dee348c commit fa2c1e5430619c6011dfe2ffbebbd59557afa228 [revision 606] Author: Sam Hocevar Date: Tue Dec 12 02:17:44 2006 +0000 * Do not assume anything about sizeof(cpu_set_t). git-svn-id: svn://svn.videolan.org/x264/trunk@606 df754926-b1dd-0310-bc7b-ec298dee348c commit 167abec138939a7475b481c6cf4a3738601f279a [revision 605] Author: Sam Hocevar Date: Mon Dec 11 16:01:49 2006 +0000 * Add support for kFreeBSD (FreeBSD kernel with GNU userland). git-svn-id: svn://svn.videolan.org/x264/trunk@605 df754926-b1dd-0310-bc7b-ec298dee348c commit e8cc72c7acc75df584cece7da9176eb4be6c9d36 [revision 604] Author: Guillaume Poirier Date: Mon Nov 27 21:40:21 2006 +0000 Add Altivec implementations of add8x8_idct8, add16x16_idct8, sa8d_8x8 and sa8d_16x16 Note: doesn't take advantage of some possible aligned memory accesses, so there's still room for improvement git-svn-id: svn://svn.videolan.org/x264/trunk@604 df754926-b1dd-0310-bc7b-ec298dee348c commit 575b238cb5e5bdb18b5541456d6eb4781bcd36d5 [revision 603] Author: Eric Petit Date: Sat Nov 25 16:31:24 2006 +0000 Force alignment of the fake .rodata on MacIntel git-svn-id: svn://svn.videolan.org/x264/trunk@603 df754926-b1dd-0310-bc7b-ec298dee348c commit 4659f518e453250783fd13bedda8a8560e8d32a5 [revision 602] Author: Loren Merritt Date: Thu Nov 23 03:13:18 2006 +0000 don't treat vbv_maxrate as a minrate too if it's higher than target average bitrate. git-svn-id: svn://svn.videolan.org/x264/trunk@602 df754926-b1dd-0310-bc7b-ec298dee348c commit c494e9c5e22d0fbe30deb7f2c1b932ac9fb449b5 [revision 601] Author: Eric Petit Date: Sat Nov 18 14:38:07 2006 +0000 Merges Guillaume Poirier's AltiVec changes: * Adds optimized quant and sub*dct8 routines * Faster sub*dct routines ~8% overall speed-up with default settings git-svn-id: svn://svn.videolan.org/x264/trunk@601 df754926-b1dd-0310-bc7b-ec298dee348c commit 41c111bc13badf2f7fa71165c94d4b528213a8cd [revision 600] Author: Loren Merritt Date: Mon Nov 6 22:49:41 2006 +0000 10% faster deblock mmx functions. ported from ffmpeg. git-svn-id: svn://svn.videolan.org/x264/trunk@600 df754926-b1dd-0310-bc7b-ec298dee348c commit 6eeb78f6a8bc0541ea68fb84ba41c456e56a2589 [revision 599] Author: Loren Merritt Date: Mon Nov 6 22:38:42 2006 +0000 checkasm: ignore insignificant differences in floating-point ssim git-svn-id: svn://svn.videolan.org/x264/trunk@599 df754926-b1dd-0310-bc7b-ec298dee348c commit 485172deeb75ca1eb88ac24b6ab6719b8a16825b [revision 598] Author: Loren Merritt Date: Mon Oct 30 02:31:48 2006 +0000 display final ratefactor in abr when a loose vbv is applied. (still disabled in true cbr) git-svn-id: svn://svn.videolan.org/x264/trunk@598 df754926-b1dd-0310-bc7b-ec298dee348c commit a9d754a37787daccdd88e4d08ec95f9b9ae59a8c [revision 597] Author: Loren Merritt Date: Mon Oct 30 00:09:21 2006 +0000 fix parsing of --deblock %d,%d (beta was ignored) git-svn-id: svn://svn.videolan.org/x264/trunk@597 df754926-b1dd-0310-bc7b-ec298dee348c commit ccac553deccf9646fdc53728f91caa988ee09176 [revision 596] Author: Loren Merritt Date: Sun Oct 29 05:48:57 2006 +0000 compute chroma_qp only once per mb git-svn-id: svn://svn.videolan.org/x264/trunk@596 df754926-b1dd-0310-bc7b-ec298dee348c commit 94a4aede9033614fa810ec88ca92dcd39822b544 [revision 595] Author: Loren Merritt Date: Sun Oct 29 01:17:33 2006 +0000 rd refinement of intra chroma direction (enabled in --subme 7) patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@595 df754926-b1dd-0310-bc7b-ec298dee348c commit e7a4aba997a82cfd5c1d5217d413513d52b6af3e [revision 594] Author: Loren Merritt Date: Wed Oct 18 04:06:44 2006 +0000 fix a crash in avc2avi git-svn-id: svn://svn.videolan.org/x264/trunk@594 df754926-b1dd-0310-bc7b-ec298dee348c commit 41b85c4fa872df880352024aa2d707b1813a4258 [revision 593] Author: Loren Merritt Date: Sun Oct 15 23:39:03 2006 +0000 skip deblocking and motion interpolation when using only I-frames git-svn-id: svn://svn.videolan.org/x264/trunk@593 df754926-b1dd-0310-bc7b-ec298dee348c commit 197a94a8cfff879ef3691eb178b5b617bda03ffb [revision 592] Author: Loren Merritt Date: Fri Oct 13 23:50:57 2006 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@592 df754926-b1dd-0310-bc7b-ec298dee348c commit 93b54ce1fbbbb191e69d8c6e33897037ba5bbfd1 [revision 591] Author: Loren Merritt Date: Fri Oct 13 20:04:58 2006 +0000 allow fractional values of crf git-svn-id: svn://svn.videolan.org/x264/trunk@591 df754926-b1dd-0310-bc7b-ec298dee348c commit 8d1ebe2eeb30a204b588502d69d361ee85187821 [revision 590] Author: Loren Merritt Date: Tue Oct 10 21:26:31 2006 +0000 prefetch pixels for motion compensation and deblocking. git-svn-id: svn://svn.videolan.org/x264/trunk@590 df754926-b1dd-0310-bc7b-ec298dee348c commit 9fadbd7b82a1bd785e206dfef81af066c7470a2f [revision 589] Author: Loren Merritt Date: Tue Oct 10 19:16:39 2006 +0000 fix a crash on interlace + >8 reference frames git-svn-id: svn://svn.videolan.org/x264/trunk@589 df754926-b1dd-0310-bc7b-ec298dee348c commit 4107178cc2998dd37cca106898e16e833a99b50e [revision 588] Author: Loren Merritt Date: Tue Oct 10 05:05:55 2006 +0000 no more decoder. it never worked anyway, and the presence of defunct code was confusing people. git-svn-id: svn://svn.videolan.org/x264/trunk@588 df754926-b1dd-0310-bc7b-ec298dee348c commit 9243c844dfc47d6dea825a0a92bd8a990d6c3cac [revision 587] Author: Loren Merritt Date: Mon Oct 9 23:31:45 2006 +0000 compute pskip_mv only once per macroblock, and store it git-svn-id: svn://svn.videolan.org/x264/trunk@587 df754926-b1dd-0310-bc7b-ec298dee348c commit 0a453377cdac1aa50a4f21be0ddfba2a93719603 [revision 586] Author: Loren Merritt Date: Mon Oct 9 20:55:54 2006 +0000 slightly faster chroma_mc_mmx git-svn-id: svn://svn.videolan.org/x264/trunk@586 df754926-b1dd-0310-bc7b-ec298dee348c commit 42bb1b494e19732fb510fb42f30bf5ae0a66dd71 [revision 585] Author: Loren Merritt Date: Mon Oct 9 17:44:47 2006 +0000 missing emms in plane_copy_mmx git-svn-id: svn://svn.videolan.org/x264/trunk@585 df754926-b1dd-0310-bc7b-ec298dee348c commit 43e4162f3026f964d0bcf4502afe13cdaaa53f4d [revision 584] Author: Loren Merritt Date: Fri Oct 6 23:25:41 2006 +0000 merge center_filter_mmx with horizontal_filter_mmx git-svn-id: svn://svn.videolan.org/x264/trunk@584 df754926-b1dd-0310-bc7b-ec298dee348c commit c485e7e75bf24036c9438467ba854bd17122e277 [revision 583] Author: Loren Merritt Date: Fri Oct 6 05:43:53 2006 +0000 1.5x faster center_filter_mmx (amd64) git-svn-id: svn://svn.videolan.org/x264/trunk@583 df754926-b1dd-0310-bc7b-ec298dee348c commit 299d3ed4ee3c8e875a428bf872bfe791a7ccd687 [revision 582] Author: Loren Merritt Date: Fri Oct 6 00:02:59 2006 +0000 mmx/prefetch implementation of plane_copy git-svn-id: svn://svn.videolan.org/x264/trunk@582 df754926-b1dd-0310-bc7b-ec298dee348c commit 9f05a7fdfc08e0279a825dc851b61c61063564b3 [revision 581] Author: Loren Merritt Date: Thu Oct 5 08:15:55 2006 +0000 no more vfw git-svn-id: svn://svn.videolan.org/x264/trunk@581 df754926-b1dd-0310-bc7b-ec298dee348c commit aabb91c80b714e9a8e2bbdca8beaf5655e9e9540 [revision 580] Author: Loren Merritt Date: Thu Oct 5 07:44:22 2006 +0000 gtk fixes: in Makefile - fix datadir for mingw users - remove the shared lib during the clean rule - use $(ENCODE_BIN) instead of x264_gtk_encode - add some $(DESTDIR) and create some directories when necessary - remove -lintl statfile_length -> statsfile_length fix the "sensitivity" of the widget of update_statfile the logo is now handled correctly on windows added: beginning of multipass support patch by Vincent Torri. git-svn-id: svn://svn.videolan.org/x264/trunk@580 df754926-b1dd-0310-bc7b-ec298dee348c commit 90a9219545cf073847160933d1c7d3e9b761e909 [revision 579] Author: Loren Merritt Date: Thu Oct 5 01:57:00 2006 +0000 accept mencoder's option names as synonyms (api only, not in x264cli) git-svn-id: svn://svn.videolan.org/x264/trunk@579 df754926-b1dd-0310-bc7b-ec298dee348c commit cb8b5dad7dc99c13e591aaf74b84b406ea80b69e [revision 578] Author: Loren Merritt Date: Tue Oct 3 01:39:38 2006 +0000 simplify satd_sse2 git-svn-id: svn://svn.videolan.org/x264/trunk@578 df754926-b1dd-0310-bc7b-ec298dee348c commit 04834f596bb9bec1e8afd7ab7b1eee74ad13df0a [revision 577] Author: Loren Merritt Date: Mon Oct 2 08:31:48 2006 +0000 better error checking in x264_param_parse. add synonyms for a few options. git-svn-id: svn://svn.videolan.org/x264/trunk@577 df754926-b1dd-0310-bc7b-ec298dee348c commit 0cbf0fc27bd391509d788ebb2e06b930bf840925 [revision 576] Author: Loren Merritt Date: Mon Oct 2 02:46:23 2006 +0000 fix some strides that weren't a multiple of 16. git-svn-id: svn://svn.videolan.org/x264/trunk@576 df754926-b1dd-0310-bc7b-ec298dee348c commit 58e12b0e792596ae4eac95df4ae358ca664a6c20 [revision 575] Author: Loren Merritt Date: Sun Oct 1 13:08:42 2006 +0000 tweak motion compensation amd64 asm. 0.3% overall speedup. git-svn-id: svn://svn.videolan.org/x264/trunk@575 df754926-b1dd-0310-bc7b-ec298dee348c commit 6de50f51151c29331040f8c24bf3697c063fe0dd [revision 574] Author: Loren Merritt Date: Sun Oct 1 08:06:22 2006 +0000 strip local symbols from asm .o files, since they confuse oprofile git-svn-id: svn://svn.videolan.org/x264/trunk@574 df754926-b1dd-0310-bc7b-ec298dee348c commit f9cc941183c0b190d09e030a30c537259c3e4088 [revision 573] Author: Loren Merritt Date: Sun Oct 1 07:25:01 2006 +0000 add an option to control direct_8x8_inference_flag, default to enabled. slightly faster encoding and decoding of p4x4 + B-frames, and is needed for strict Levels compliance. git-svn-id: svn://svn.videolan.org/x264/trunk@573 df754926-b1dd-0310-bc7b-ec298dee348c commit 8f0864a9f5755a02158cbb96c215afee95846639 [revision 572] Author: Loren Merritt Date: Sun Oct 1 03:05:15 2006 +0000 allow custom deadzones for non-trellis quantization. patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@572 df754926-b1dd-0310-bc7b-ec298dee348c commit a7cd9cf8ba99e4703805d34eef494d47850f9b99 [revision 571] Author: Loren Merritt Date: Sun Oct 1 02:44:36 2006 +0000 move zigzag scan functions to dsp function pointers. mmx implementation of interlaced zigzag. git-svn-id: svn://svn.videolan.org/x264/trunk@571 df754926-b1dd-0310-bc7b-ec298dee348c commit faec300a71cdb64e1bd27d393de51d2e3d1f5992 [revision 570] Author: Loren Merritt Date: Sun Oct 1 02:41:22 2006 +0000 support interlace. uses MBAFF syntax, but is not adaptive yet. git-svn-id: svn://svn.videolan.org/x264/trunk@570 df754926-b1dd-0310-bc7b-ec298dee348c commit 3b7857057005ebecc9852ef56c9f725d26b94bc4 [revision 569] Author: Loren Merritt Date: Wed Sep 27 06:37:19 2006 +0000 allow --zones in cqp encodes git-svn-id: svn://svn.videolan.org/x264/trunk@569 df754926-b1dd-0310-bc7b-ec298dee348c commit 7960eaf2451a445078a3c55b1ade5efc1948e02f [revision 568] Author: Loren Merritt Date: Tue Sep 26 19:27:07 2006 +0000 cli: fix some typos in vui parameters from r542. patch by Foxy Shadis. git-svn-id: svn://svn.videolan.org/x264/trunk@568 df754926-b1dd-0310-bc7b-ec298dee348c commit 1b93c2a5cccb6ad7aedfd2746c671dcfffe60795 [revision 567] Author: Sam Hocevar Date: Mon Sep 25 10:25:55 2006 +0000 * Add an "all" rule to the Makefile. Ideally "default" should be renamed, but I don't want to break existing scripts. git-svn-id: svn://svn.videolan.org/x264/trunk@567 df754926-b1dd-0310-bc7b-ec298dee348c commit 52b8a0530721abb276c286d95a7827da3529baa4 [revision 566] Author: Loren Merritt Date: Sun Sep 24 21:35:56 2006 +0000 workaround: on some systems, alloca() isn't aligned git-svn-id: svn://svn.videolan.org/x264/trunk@566 df754926-b1dd-0310-bc7b-ec298dee348c commit 7bfc7360d713aa651b71691a9fb85f34104573e4 [revision 565] Author: Loren Merritt Date: Fri Sep 22 16:39:05 2006 +0000 missing picpop git-svn-id: svn://svn.videolan.org/x264/trunk@565 df754926-b1dd-0310-bc7b-ec298dee348c commit 460699ffdfaf263faaebd720c82475d1fb949279 [revision 564] Author: Loren Merritt Date: Wed Sep 13 19:24:13 2006 +0000 fix a buffer overread from r540 git-svn-id: svn://svn.videolan.org/x264/trunk@564 df754926-b1dd-0310-bc7b-ec298dee348c commit 55c208edb7e3a3ad54c1c1412577003b1eb13a69 [revision 563] Author: Loren Merritt Date: Tue Sep 12 23:32:21 2006 +0000 cosmetics (spelling) git-svn-id: svn://svn.videolan.org/x264/trunk@563 df754926-b1dd-0310-bc7b-ec298dee348c commit 8850b6faaf55b83ed3aa86ff9fcb5e35c439b236 [revision 562] Author: Loren Merritt Date: Tue Sep 12 22:21:23 2006 +0000 faster ESA git-svn-id: svn://svn.videolan.org/x264/trunk@562 df754926-b1dd-0310-bc7b-ec298dee348c commit f8652aab3dda281aa446ead0674d7e1f1c6d6e74 [revision 561] Author: Loren Merritt Date: Tue Sep 12 22:18:29 2006 +0000 faster ESA git-svn-id: svn://svn.videolan.org/x264/trunk@561 df754926-b1dd-0310-bc7b-ec298dee348c commit a020d6ec4933d9f431be5bad429289b87efabe17 [revision 560] Author: Sam Hocevar Date: Sun Sep 10 17:37:13 2006 +0000 * Use the autotool's config.guess script instead of uname to check the system and CPU types, to avoid issues when using for instance a 32-bit userland on top of a 64-bit kernel. git-svn-id: svn://svn.videolan.org/x264/trunk@560 df754926-b1dd-0310-bc7b-ec298dee348c commit 477c5bfb641d0c662a98a3474c6a2a441476b5a1 [revision 559] Author: Sam Hocevar Date: Sun Sep 10 17:16:21 2006 +0000 * Add the autotool's config.guess script so that we can use it instead of uname in the configure script. git-svn-id: svn://svn.videolan.org/x264/trunk@559 df754926-b1dd-0310-bc7b-ec298dee348c commit 4a5a03bbba8a8100d84bf5c30709eec133dda282 [revision 558] Author: Loren Merritt Date: Tue Aug 22 07:43:14 2006 +0000 10l in r553 git-svn-id: svn://svn.videolan.org/x264/trunk@558 df754926-b1dd-0310-bc7b-ec298dee348c commit 73657d88d1e1371d684eea805fb88c008e44e96b [revision 557] Author: Loren Merritt Date: Mon Aug 21 00:46:20 2006 +0000 ssim broke on amd64 w/ pic. git-svn-id: svn://svn.videolan.org/x264/trunk@557 df754926-b1dd-0310-bc7b-ec298dee348c commit 1808700e26023fccfabec9e65ee7e4fb18ae57f2 [revision 556] Author: Steve Lhomme Date: Fri Aug 18 20:50:10 2006 +0000 MSVC compatibility fix from Haali git-svn-id: svn://svn.videolan.org/x264/trunk@556 df754926-b1dd-0310-bc7b-ec298dee348c commit f78c224c21e991b55deb637d35b30b06a78d78da [revision 555] Author: Loren Merritt Date: Thu Aug 17 22:49:45 2006 +0000 support changing some more parameters in x264_encoder_reconfig() git-svn-id: svn://svn.videolan.org/x264/trunk@555 df754926-b1dd-0310-bc7b-ec298dee348c commit 7760f1b2e78360542e31eb55db81e84dcb4f95ac [revision 554] Author: Loren Merritt Date: Thu Aug 17 21:57:59 2006 +0000 SSIM computation. (default on, disable by --no-ssim) git-svn-id: svn://svn.videolan.org/x264/trunk@554 df754926-b1dd-0310-bc7b-ec298dee348c commit 127e2fbf0a338549b00f6a3022ce1d2bab1d2acb [revision 553] Author: Loren Merritt Date: Wed Aug 16 20:13:06 2006 +0000 configure: --enable-debug reduces optimization to -O1 git-svn-id: svn://svn.videolan.org/x264/trunk@553 df754926-b1dd-0310-bc7b-ec298dee348c commit dc5d530e13e0aa38218feb7c6c1fc4b75c0b7261 [revision 552] Author: Loren Merritt Date: Wed Aug 16 19:57:08 2006 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@552 df754926-b1dd-0310-bc7b-ec298dee348c commit 12bd065367f7aa0a4efc550dd595142d783a9525 [revision 551] Author: Loren Merritt Date: Fri Aug 4 03:12:43 2006 +0000 gcc -fprofile-generate isn't threadsafe git-svn-id: svn://svn.videolan.org/x264/trunk@551 df754926-b1dd-0310-bc7b-ec298dee348c commit b3f15918ac03a0fc3b6fe1a8311fabedf5fa6b53 [revision 550] Author: Loren Merritt Date: Thu Aug 3 19:49:17 2006 +0000 cli: move some options from --help to --longhelp git-svn-id: svn://svn.videolan.org/x264/trunk@550 df754926-b1dd-0310-bc7b-ec298dee348c commit 64a8b781013148bd351c2a45fdcb6a9aaf26ff4a [revision 549] Author: Loren Merritt Date: Thu Aug 3 18:22:08 2006 +0000 cli: don't try to get resolution from filename unless input is rawyuv git-svn-id: svn://svn.videolan.org/x264/trunk@549 df754926-b1dd-0310-bc7b-ec298dee348c commit de41ef2605bf1d5ded851888438a55a6bf66c42a [revision 548] Author: Loren Merritt Date: Thu Aug 3 18:13:56 2006 +0000 r542 broke --visualize git-svn-id: svn://svn.videolan.org/x264/trunk@548 df754926-b1dd-0310-bc7b-ec298dee348c commit 1ff80a23db8912967238d1d9d4df2aebeeafbd1a [revision 547] Author: Eric Petit Date: Wed Aug 2 18:11:21 2006 +0000 Nicer OS X x264_cpu_num_processors (thanks David) git-svn-id: svn://svn.videolan.org/x264/trunk@547 df754926-b1dd-0310-bc7b-ec298dee348c commit 3493a54ddfd504fdd993c98a321916f98b96b09d [revision 546] Author: Eric Petit Date: Tue Aug 1 15:20:35 2006 +0000 Support OS X and BeOS in x264_cpu_num_processors git-svn-id: svn://svn.videolan.org/x264/trunk@546 df754926-b1dd-0310-bc7b-ec298dee348c commit 86ec16c126168adbb6f615159e26065b4b0000a7 [revision 545] Author: Eric Petit Date: Tue Aug 1 15:18:31 2006 +0000 Fixes contexts allocation with threads=auto git-svn-id: svn://svn.videolan.org/x264/trunk@545 df754926-b1dd-0310-bc7b-ec298dee348c commit eeebca20ad854271fab898f306f1657887ca6588 [revision 544] Author: Loren Merritt Date: Tue Aug 1 02:22:36 2006 +0000 select initial qp for abr and cbr baased on satd and bitrate, rather than cq24. git-svn-id: svn://svn.videolan.org/x264/trunk@544 df754926-b1dd-0310-bc7b-ec298dee348c commit 9e9a869aa7dc3c39a883a5a5886c0bbd82e8b95c [revision 543] Author: Loren Merritt Date: Tue Aug 1 00:17:18 2006 +0000 --threads=auto to detect number of cpus git-svn-id: svn://svn.videolan.org/x264/trunk@543 df754926-b1dd-0310-bc7b-ec298dee348c commit 0b07708cdab6c2f54673930c10d2908c53e19120 [revision 542] Author: Loren Merritt Date: Mon Jul 31 21:59:04 2006 +0000 api addition: x264_param_parse() to set options by name git-svn-id: svn://svn.videolan.org/x264/trunk@542 df754926-b1dd-0310-bc7b-ec298dee348c commit 99b8471e5032bfab1f81cb04fb350ee6fa878561 [revision 541] Author: Loren Merritt Date: Mon Jul 31 06:34:53 2006 +0000 fix a rare NaN in ratecontrol git-svn-id: svn://svn.videolan.org/x264/trunk@541 df754926-b1dd-0310-bc7b-ec298dee348c commit adc4b4f85e3682bd4868df21babcea725c919bef [revision 540] Author: Loren Merritt Date: Sun Jul 30 02:39:05 2006 +0000 move quant_mf[] from x264_t to the heap, and merge duplicate entries git-svn-id: svn://svn.videolan.org/x264/trunk@540 df754926-b1dd-0310-bc7b-ec298dee348c commit 75d6edb847722b1058914a2effcc6d47c5b7971a [revision 539] Author: Loren Merritt Date: Fri Jul 28 21:39:07 2006 +0000 GTK update. patch by Vincent Torri. fixed: cleaning of Makefile time elapsed seems broken ('total time' label replaced by 'time remaining') text entries of the status window are now not editable added: compilation from x264/ (add --enable-gtk option to configure) shared lib creation if --enable-shared is passed to configure x264gtk.pc --b-rdo, --no-dct-decimate git-svn-id: svn://svn.videolan.org/x264/trunk@539 df754926-b1dd-0310-bc7b-ec298dee348c commit 4b2556f688c482e2cbda025a5468bb4853810f89 [revision 538] Author: Loren Merritt Date: Sun Jul 23 19:19:40 2006 +0000 new option: --qpfile forces frames types and QPs. (intended for ratecontrol experiments, not for real encodes) git-svn-id: svn://svn.videolan.org/x264/trunk@538 df754926-b1dd-0310-bc7b-ec298dee348c commit b8692db2efa98c719332979f6dc8fc39af8f1eff [revision 537] Author: Loren Merritt Date: Tue Jul 18 01:10:54 2006 +0000 api change: select ratecontrol method with an enum (param.rc.i_rc_method) instead of a bunch of booleans. git-svn-id: svn://svn.videolan.org/x264/trunk@537 df754926-b1dd-0310-bc7b-ec298dee348c commit 4e7bd4a1f038aeb538bfd07fc2d5ac67c041a0e9 [revision 536] Author: Loren Merritt Date: Sun Jul 16 18:28:39 2006 +0000 slightly faster mmx dct git-svn-id: svn://svn.videolan.org/x264/trunk@536 df754926-b1dd-0310-bc7b-ec298dee348c commit 637470c0dbbadf4b0d8b01c6c2179a4305f2b203 [revision 535] Author: Loren Merritt Date: Sun Jul 16 18:25:38 2006 +0000 OpenBSD build fixes. patch by Vizeli Pascal (pvizeli at yahoo dot de) git-svn-id: svn://svn.videolan.org/x264/trunk@535 df754926-b1dd-0310-bc7b-ec298dee348c commit c6213d2fbb3b9a1a23c7cbe023080e98ee8e35a6 [revision 534] Author: Loren Merritt Date: Sat Jul 8 17:56:22 2006 +0000 mc_chroma width2 mmx git-svn-id: svn://svn.videolan.org/x264/trunk@534 df754926-b1dd-0310-bc7b-ec298dee348c commit eff6a5e204a9d9789b44d1ebf3de510f0f6d4334 [revision 533] Author: Måns Rullgård Date: Wed Jun 28 21:58:58 2006 +0000 make libx264.so symlink relative git-svn-id: svn://svn.videolan.org/x264/trunk@533 df754926-b1dd-0310-bc7b-ec298dee348c commit 360853117fccdd082a6654c34c2c7dc2577f10cd [revision 532] Author: Loren Merritt Date: Mon Jun 12 08:22:09 2006 +0000 GTK update. patch by Vincent Torri. added: direct=auto no-fast-pskip vbv cqm tooltips (without descriptions yet) translations `make clean` for .exe when file exists, ask for override fixes: debug level bug bitrate slider bug mixed-refs can be set only if ref>1 i8x8 can be set only if 8x8 transform is enabled # of threads capped at 4 fourcc can't be removed cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@532 df754926-b1dd-0310-bc7b-ec298dee348c commit 918c7ef4540f8741ecc23d74f227abd594986fba [revision 531] Author: Loren Merritt Date: Wed May 31 23:55:35 2006 +0000 vfw installer: tweak nsis compression. patch by Francesco Corriga. git-svn-id: svn://svn.videolan.org/x264/trunk@531 df754926-b1dd-0310-bc7b-ec298dee348c commit b5d08311537cf047b1720243f866264abc5150d3 [revision 530] Author: Eric Petit Date: Tue May 30 10:05:56 2006 +0000 Fixed typo that caused x264_encoder_open to always fail git-svn-id: svn://svn.videolan.org/x264/trunk@530 df754926-b1dd-0310-bc7b-ec298dee348c commit 91bbfb98c003715a0a44c31053c6e361b5262995 [revision 529] Author: Loren Merritt Date: Tue May 30 07:07:55 2006 +0000 check some mallocs' return value git-svn-id: svn://svn.videolan.org/x264/trunk@529 df754926-b1dd-0310-bc7b-ec298dee348c commit 0ac281dc97ac8c0b85165062d5f25227c9e0142f [revision 528] Author: Loren Merritt Date: Sun May 28 21:14:24 2006 +0000 make -> $(MAKE) git-svn-id: svn://svn.videolan.org/x264/trunk@528 df754926-b1dd-0310-bc7b-ec298dee348c commit c9155d3e9fbe646e2504c09b5b751f7279900522 [revision 527] Author: Loren Merritt Date: Wed May 24 03:59:19 2006 +0000 convert non-fatal errors to message level "warning". git-svn-id: svn://svn.videolan.org/x264/trunk@527 df754926-b1dd-0310-bc7b-ec298dee348c commit ae4b97a2530da93b750ac6b525941d408294a216 [revision 526] Author: Loren Merritt Date: Mon May 22 23:07:58 2006 +0000 fix a memory alignment. (no effect on x86, but might be needed for other simd) git-svn-id: svn://svn.videolan.org/x264/trunk@526 df754926-b1dd-0310-bc7b-ec298dee348c commit c832ac1af21e82f7418077c6e61d13819420fd61 [revision 525] Author: Loren Merritt Date: Fri May 19 20:10:41 2006 +0000 when using DEBUG_DUMP_FRAME, write decoded pictures in display order. patch by Loic Le Loarer. git-svn-id: svn://svn.videolan.org/x264/trunk@525 df754926-b1dd-0310-bc7b-ec298dee348c commit 361b283acbcc2d4b1ff8171ac2945449936e5b93 [revision 524] Author: Loren Merritt Date: Fri May 19 19:14:29 2006 +0000 non-referenced B-frames should have the same frame_num as the following ref frame, not the previous. patch by Loic Le Loarer. git-svn-id: svn://svn.videolan.org/x264/trunk@524 df754926-b1dd-0310-bc7b-ec298dee348c commit 1ca108eb75a947ad86dc4ab88ecf6e69bd22d358 [revision 523] Author: Loren Merritt Date: Fri May 12 08:17:53 2006 +0000 set the SPS constraint_set[01]_flag based on the profile in use, just in case some decoder cares git-svn-id: svn://svn.videolan.org/x264/trunk@523 df754926-b1dd-0310-bc7b-ec298dee348c commit de1af4c2e5528fae5918ddf00d6ae09a68ea2222 [revision 522] Author: Loren Merritt Date: Wed May 10 16:47:05 2006 +0000 msvc doesn't like C99 named array initializers git-svn-id: svn://svn.videolan.org/x264/trunk@522 df754926-b1dd-0310-bc7b-ec298dee348c commit 7022fe85d84db4a5a95a7b4ac699423277638881 [revision 521] Author: Loren Merritt Date: Wed May 10 16:42:07 2006 +0000 allow sar=1/1. patch by Loic Le Loarer. git-svn-id: svn://svn.videolan.org/x264/trunk@521 df754926-b1dd-0310-bc7b-ec298dee348c commit 0d88274d5d833030ee3f54dfd09ef039f341da01 [revision 520] Author: Loren Merritt Date: Wed May 10 06:09:48 2006 +0000 faster intra search: filter i8x8 edges only once, and reuse for multiple predictions. git-svn-id: svn://svn.videolan.org/x264/trunk@520 df754926-b1dd-0310-bc7b-ec298dee348c commit 3de28cd5878a8be64e1db80ff2453e79acb0040d [revision 519] Author: Loren Merritt Date: Tue May 9 06:11:42 2006 +0000 faster intra search: some prediction modes don't have to compute a full hadamard transform. x86 and amd64 asm. git-svn-id: svn://svn.videolan.org/x264/trunk@519 df754926-b1dd-0310-bc7b-ec298dee348c commit e63f25b44ed1bfd2ccd0ff1e7f1f453c6ba08179 [revision 518] Author: Loren Merritt Date: Sat May 6 17:12:23 2006 +0000 --sps-id, to allow concatenating streams with different settings. git-svn-id: svn://svn.videolan.org/x264/trunk@518 df754926-b1dd-0310-bc7b-ec298dee348c commit 609deaf54ad74816ba061c7022172986694049c9 [revision 517] Author: Loren Merritt Date: Wed May 3 17:59:23 2006 +0000 typo in expand_border_mod16 git-svn-id: svn://svn.videolan.org/x264/trunk@517 df754926-b1dd-0310-bc7b-ec298dee348c commit f5bdc82806070eb101f3c6ab9a5370c4788d7597 [revision 516] Author: Loren Merritt Date: Sun Apr 30 01:21:49 2006 +0000 typo impaired 2pass bitrate prediction. git-svn-id: svn://svn.videolan.org/x264/trunk@516 df754926-b1dd-0310-bc7b-ec298dee348c commit f51297f19065e4ab34179e6e1d785b28fb3ad6be [revision 515] Author: Eric Petit Date: Sat Apr 29 11:13:04 2006 +0000 Let the user choose the compiler with "CC=xxx ./configure" git-svn-id: svn://svn.videolan.org/x264/trunk@515 df754926-b1dd-0310-bc7b-ec298dee348c commit e7141289a2f8d07168d19fc751bb302a9c32a79e [revision 514] Author: Eric Petit Date: Sat Apr 29 11:12:16 2006 +0000 More vector types fixes for gcc 3.3 git-svn-id: svn://svn.videolan.org/x264/trunk@514 df754926-b1dd-0310-bc7b-ec298dee348c commit f3323f8478176852ff8e974217cb59227bbb693e [revision 513] Author: Eric Petit Date: Fri Apr 28 17:13:37 2006 +0000 More vector casts to try and make compilers happier git-svn-id: svn://svn.videolan.org/x264/trunk@513 df754926-b1dd-0310-bc7b-ec298dee348c commit c1f64a50b7563b737c8938ed796f46d3bad354a4 [revision 512] Author: Loren Merritt Date: Tue Apr 25 04:08:21 2006 +0000 Use sa8d instead of satd for i8x8 search. +.01 dB, -.5% speed git-svn-id: svn://svn.videolan.org/x264/trunk@512 df754926-b1dd-0310-bc7b-ec298dee348c commit 8aa294381e5e7725a9ae01ce84d1a8f4ac86a8eb [revision 511] Author: Loren Merritt Date: Mon Apr 24 19:34:06 2006 +0000 Before evaluating the RD score of any mode, check satd and abort if it's much worse than some other mode. Also apply more early termination to intra search. speed at -m1:+1%, -m4:+3%, -m6:+8%, -m7:+20% git-svn-id: svn://svn.videolan.org/x264/trunk@511 df754926-b1dd-0310-bc7b-ec298dee348c commit eb3d83c0a32636674f59bbd7f8fedd430f1e4c2c [revision 510] Author: Sam Hocevar Date: Mon Apr 24 19:01:10 2006 +0000 * common/ppc/pixel.c: fixed illegal implicit casts of vector types. git-svn-id: svn://svn.videolan.org/x264/trunk@510 df754926-b1dd-0310-bc7b-ec298dee348c commit 17b90bf3c3f6a2c5711b68be39df784204b425b8 [revision 509] Author: Sam Hocevar Date: Mon Apr 24 18:49:50 2006 +0000 * Added %$#@#$! support for #@%$!#@ armv4l CPU. git-svn-id: svn://svn.videolan.org/x264/trunk@509 df754926-b1dd-0310-bc7b-ec298dee348c commit 35a5a4f121667717bfd783d3dcbb6e79fc7c8668 [revision 508] Author: Loren Merritt Date: Mon Apr 24 08:27:26 2006 +0000 When evaluating predictors to start fullpel motion search, use subpel positions instead of rounding to fullpel. about +.02 dB, -1.6% speed at subme>=3 patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@508 df754926-b1dd-0310-bc7b-ec298dee348c commit 6b577361fbab9d785787eba3e16a63a23d84be28 [revision 507] Author: Loren Merritt Date: Mon Apr 24 03:52:55 2006 +0000 mmx implementation of x264_pixel_sa8d git-svn-id: svn://svn.videolan.org/x264/trunk@507 df754926-b1dd-0310-bc7b-ec298dee348c commit af751ac37f6397567696ba7eb2479f72ea2c2004 [revision 506] Author: Loren Merritt Date: Thu Apr 20 23:48:46 2006 +0000 10l in r463 (q0 i16x16 dc was permuted) git-svn-id: svn://svn.videolan.org/x264/trunk@506 df754926-b1dd-0310-bc7b-ec298dee348c commit 2735ae7056413fb6d4461618269b971e8ae45915 [revision 505] Author: Loren Merritt Date: Thu Apr 20 20:33:25 2006 +0000 typo in r504 git-svn-id: svn://svn.videolan.org/x264/trunk@505 df754926-b1dd-0310-bc7b-ec298dee348c commit 656b0698c90b7c3b3365416391b6d3706d1afb0d [revision 504] Author: Loren Merritt Date: Thu Apr 20 04:38:45 2006 +0000 update msvc project files. patch by anonymous. git-svn-id: svn://svn.videolan.org/x264/trunk@504 df754926-b1dd-0310-bc7b-ec298dee348c commit 2f95856be50ec7c744ad4d65408846d3dce75491 [revision 503] Author: Loren Merritt Date: Wed Apr 19 09:02:19 2006 +0000 Before, we eliminated dct blocks containing only a small single coefficient. Now that behavior is optional, by --no-dct-decimate. based on a patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@503 df754926-b1dd-0310-bc7b-ec298dee348c commit 9a6815d3cb06e48888bdd1d804ae9eb72adf71bb [revision 502] Author: Eric Petit Date: Mon Apr 17 11:08:58 2006 +0000 Enables more agressive optimizations (-fastf -mcpu=G4) on OS X. Adds AltiVec interleaved SAD and SSD16x16. Overall speedup up to 20%. Patch by anonymous git-svn-id: svn://svn.videolan.org/x264/trunk@502 df754926-b1dd-0310-bc7b-ec298dee348c commit 97ab2190599297ab0edaa62b8a7027117ca74ed5 [revision 501] Author: Loren Merritt Date: Mon Apr 17 01:19:47 2006 +0000 faster cabac_encode_bypass git-svn-id: svn://svn.videolan.org/x264/trunk@501 df754926-b1dd-0310-bc7b-ec298dee348c commit 1927b27504ebee226b5eb1a51461b35e72a6d542 [revision 500] Author: Eric Petit Date: Sun Apr 16 18:24:38 2006 +0000 restored AltiVec dct git-svn-id: svn://svn.videolan.org/x264/trunk@500 df754926-b1dd-0310-bc7b-ec298dee348c commit d0a556549b9ae59c3a30b6c1b0280e6857350da3 [revision 499] Author: Eric Petit Date: Sun Apr 16 16:38:16 2006 +0000 more AltiVec mc, ~4.5% overall speedup git-svn-id: svn://svn.videolan.org/x264/trunk@499 df754926-b1dd-0310-bc7b-ec298dee348c commit 71f11146131d1804311d86535a6aa7d0ff777501 [revision 498] Author: Loren Merritt Date: Wed Apr 12 21:21:59 2006 +0000 slightly faster loopfilter git-svn-id: svn://svn.videolan.org/x264/trunk@498 df754926-b1dd-0310-bc7b-ec298dee348c commit d2ab724f262f831a320ba75b81092bc182bca695 [revision 497] Author: Loren Merritt Date: Wed Apr 12 06:28:52 2006 +0000 3% faster satd_mmx git-svn-id: svn://svn.videolan.org/x264/trunk@497 df754926-b1dd-0310-bc7b-ec298dee348c commit a23a3678b474450876ac297c979bb2ad27afe6f4 [revision 496] Author: Loren Merritt Date: Wed Apr 12 00:45:07 2006 +0000 cosmetics in sad/ssd/satd mmx git-svn-id: svn://svn.videolan.org/x264/trunk@496 df754926-b1dd-0310-bc7b-ec298dee348c commit b3ad52d4860127a9b7348923671b595b98cb4d09 [revision 495] Author: Loren Merritt Date: Tue Apr 11 21:16:44 2006 +0000 store quoted configure options. needed e.g. for multiple args under --extra-cflags. git-svn-id: svn://svn.videolan.org/x264/trunk@495 df754926-b1dd-0310-bc7b-ec298dee348c commit c293b80b3b57789bbac663f0eac0ee26c1bd8eec [revision 494] Author: Loren Merritt Date: Tue Apr 11 10:45:00 2006 +0000 fix a yasm-incompatible syntax in x86 asm git-svn-id: svn://svn.videolan.org/x264/trunk@494 df754926-b1dd-0310-bc7b-ec298dee348c commit ae82d2423aa1d54eb367ca292262bc6bd3dec134 [revision 493] Author: Loren Merritt Date: Tue Apr 11 02:41:43 2006 +0000 yasm noexec stack git-svn-id: svn://svn.videolan.org/x264/trunk@493 df754926-b1dd-0310-bc7b-ec298dee348c commit 283d57ed95a22f104e4f22a6119e2aaeeca39833 [revision 492] Author: Loren Merritt Date: Mon Apr 10 18:46:54 2006 +0000 more interleaved SAD. 25% faster halfpel. git-svn-id: svn://svn.videolan.org/x264/trunk@492 df754926-b1dd-0310-bc7b-ec298dee348c commit 0f4c0eb836912fcbd2376c920a9dd7bf438f4e43 [revision 491] Author: Loren Merritt Date: Mon Apr 10 17:56:02 2006 +0000 more interleaved SAD. 1% faster umh, 6% faster esa. git-svn-id: svn://svn.videolan.org/x264/trunk@491 df754926-b1dd-0310-bc7b-ec298dee348c commit 8947b51f35151f821c3718b01c1e93d517d814b5 [revision 490] Author: Loren Merritt Date: Mon Apr 10 03:03:13 2006 +0000 interleave multiple calls to SAD. 15% faster fullpel motion estimation. git-svn-id: svn://svn.videolan.org/x264/trunk@490 df754926-b1dd-0310-bc7b-ec298dee348c commit bddf5f03ff621a8fdbcc9925453573600984b27d [revision 489] Author: Sam Hocevar Date: Sun Apr 9 13:20:17 2006 +0000 * Added support for ppc64. I'm really fucking tired of having to do this. git-svn-id: svn://svn.videolan.org/x264/trunk@489 df754926-b1dd-0310-bc7b-ec298dee348c commit ac4249d20d47f75ca9aebb04c4b329c2d497100c [revision 488] Author: Måns Rullgård Date: Sat Apr 8 01:33:13 2006 +0000 use LDFLAGS when linking shared lib git-svn-id: svn://svn.videolan.org/x264/trunk@488 df754926-b1dd-0310-bc7b-ec298dee348c commit 85ab23ceca033df09e17d650647cba3d4995f9e8 [revision 487] Author: Felix Paul Kühne Date: Wed Mar 29 06:37:55 2006 +0000 * compilation fix for mingw, darwin (off_t was undefined) git-svn-id: svn://svn.videolan.org/x264/trunk@487 df754926-b1dd-0310-bc7b-ec298dee348c commit 540ed9aafdf7577cf51914676dfc010952c76052 [revision 486] Author: Loren Merritt Date: Mon Mar 27 08:11:37 2006 +0000 GTK: support yuv4mpeg input. patch by Vincent Torri. git-svn-id: svn://svn.videolan.org/x264/trunk@486 df754926-b1dd-0310-bc7b-ec298dee348c commit 5fe3e77f3d9f20286feb1432a63d2d6652dc8777 [revision 485] Author: Loren Merritt Date: Sun Mar 26 20:54:33 2006 +0000 GTK: fix avs input patch by Vincent Torri. git-svn-id: svn://svn.videolan.org/x264/trunk@485 df754926-b1dd-0310-bc7b-ec298dee348c commit a84899e0b4fd16d49cb7085a275cd7bb1ce9f67c [revision 484] Author: Loren Merritt Date: Sun Mar 26 20:40:20 2006 +0000 cli: support yuv4mpeg input. patch by anonymous. git-svn-id: svn://svn.videolan.org/x264/trunk@484 df754926-b1dd-0310-bc7b-ec298dee348c commit e510c09889d83c5d7e85eca83036af36f1284b87 [revision 483] Author: Loren Merritt Date: Sun Mar 26 01:09:09 2006 +0000 GTK: compilation fixes git-svn-id: svn://svn.videolan.org/x264/trunk@483 df754926-b1dd-0310-bc7b-ec298dee348c commit 845feff6507dc682944a8c733f8ee1c8a4da6f09 [revision 482] Author: Loren Merritt Date: Sat Mar 25 23:26:07 2006 +0000 GTK: compilation fixes on mingw, add avs input for the app (if avalaible), add filters for the filechooser, add icon for the main window. patch by Vincent Torri. git-svn-id: svn://svn.videolan.org/x264/trunk@482 df754926-b1dd-0310-bc7b-ec298dee348c commit fc34d38657b89418cb68576889c647a4aa5e8108 [revision 481] Author: Loren Merritt Date: Sat Mar 25 10:13:12 2006 +0000 GTK-based graphical frontend. patch by Vincent Torri. git-svn-id: svn://svn.videolan.org/x264/trunk@481 df754926-b1dd-0310-bc7b-ec298dee348c commit 50aadebc2de139f2003314bfbeea7d7ce1680901 [revision 480] Author: Loren Merritt Date: Sat Mar 25 10:06:15 2006 +0000 silence some gcc warnings git-svn-id: svn://svn.videolan.org/x264/trunk@480 df754926-b1dd-0310-bc7b-ec298dee348c commit 5b6c5effb4c1864b692213cafef9c85e3623573c [revision 479] Author: Loren Merritt Date: Fri Mar 24 21:45:39 2006 +0000 use FDEC_STRIDE instead of a parameter in mmx dct .5% speedup git-svn-id: svn://svn.videolan.org/x264/trunk@479 df754926-b1dd-0310-bc7b-ec298dee348c commit da9158b3ec035e8261e6fe2c5fd77e073425ed08 [revision 478] Author: Sam Hocevar Date: Wed Mar 22 14:21:53 2006 +0000 * configure: support for 64 bits MIPS. git-svn-id: svn://svn.videolan.org/x264/trunk@478 df754926-b1dd-0310-bc7b-ec298dee348c commit 7c013538206535c3abd70eb56a00bed0dccb43c5 [revision 477] Author: Loren Merritt Date: Mon Mar 20 23:32:43 2006 +0000 10l in r473 and stdin git-svn-id: svn://svn.videolan.org/x264/trunk@477 df754926-b1dd-0310-bc7b-ec298dee348c commit 3b66f690bd8a7d1417cedf98aec0df2702338bb2 [revision 476] Author: Loren Merritt Date: Mon Mar 20 23:00:52 2006 +0000 RD subpel motion estimation (--subme 7) git-svn-id: svn://svn.videolan.org/x264/trunk@476 df754926-b1dd-0310-bc7b-ec298dee348c commit 48633d2afd50d9f15b83e6639024c382dd958c76 [revision 475] Author: Loren Merritt Date: Mon Mar 20 22:46:38 2006 +0000 cosmetics in cabac_mb_cbf git-svn-id: svn://svn.videolan.org/x264/trunk@475 df754926-b1dd-0310-bc7b-ec298dee348c commit 50f40fd2a3b63ef89e8f94085ef2ed971a408468 [revision 474] Author: Loren Merritt Date: Sun Mar 19 11:00:02 2006 +0000 separate --thread-input from --threads git-svn-id: svn://svn.videolan.org/x264/trunk@474 df754926-b1dd-0310-bc7b-ec298dee348c commit 36c25b664e28e4e15ce49242af6ac306eb6f7cca [revision 473] Author: Loren Merritt Date: Sat Mar 18 09:20:06 2006 +0000 if --threads > 1, then read the input stream in its own thread. git-svn-id: svn://svn.videolan.org/x264/trunk@473 df754926-b1dd-0310-bc7b-ec298dee348c commit c20906a56b2f17a89b7ba4afc87e78202447e7fb [revision 472] Author: Måns Rullgård Date: Fri Mar 17 22:27:31 2006 +0000 FreeBSD uses ELF git-svn-id: svn://svn.videolan.org/x264/trunk@472 df754926-b1dd-0310-bc7b-ec298dee348c commit 9c61b0cbd68304b8f2860dc2d7df401ad6839b81 [revision 471] Author: Loren Merritt Date: Fri Mar 17 22:10:34 2006 +0000 10l in r470 on x86_64 git-svn-id: svn://svn.videolan.org/x264/trunk@471 df754926-b1dd-0310-bc7b-ec298dee348c commit fdb64099b4da93ffa70af98aad85cc7c6fc564d0 [revision 470] Author: Loren Merritt Date: Fri Mar 17 21:36:27 2006 +0000 some mmxext functions really only required mmx. git-svn-id: svn://svn.videolan.org/x264/trunk@470 df754926-b1dd-0310-bc7b-ec298dee348c commit abffd18fe30bcc0daa344a7dcedab30ddc3e97f6 [revision 469] Author: Loren Merritt Date: Fri Mar 17 07:36:06 2006 +0000 simplify get_ref and mc_luma git-svn-id: svn://svn.videolan.org/x264/trunk@469 df754926-b1dd-0310-bc7b-ec298dee348c commit 9eaa83d4c97cfceaeb6491ae8e7a74c0bd6f397b [revision 468] Author: Loren Merritt Date: Wed Mar 15 04:38:57 2006 +0000 b16x16 wpred analysis used wrong weight git-svn-id: svn://svn.videolan.org/x264/trunk@468 df754926-b1dd-0310-bc7b-ec298dee348c commit 926212b3f0ff67ffc8ea2e3a7b299c016a00404c [revision 467] Author: Loren Merritt Date: Sat Mar 11 03:32:37 2006 +0000 configure: --enable-shared for libx264.so git-svn-id: svn://svn.videolan.org/x264/trunk@467 df754926-b1dd-0310-bc7b-ec298dee348c commit 8f79dcc217245ebdc4aba8be505526c7277c6d3c [revision 466] Author: Loren Merritt Date: Fri Mar 10 18:58:29 2006 +0000 wrong modulus when delta_qp = +26 git-svn-id: svn://svn.videolan.org/x264/trunk@466 df754926-b1dd-0310-bc7b-ec298dee348c commit 78f414d5646e018254339a9a5db08bdf69de6551 [revision 465] Author: Loren Merritt Date: Thu Mar 9 16:56:33 2006 +0000 10l in vbv + 2pass git-svn-id: svn://svn.videolan.org/x264/trunk@465 df754926-b1dd-0310-bc7b-ec298dee348c commit d8e790ca7c0a524ea0aa01bf0d9020530e3dba9a [revision 464] Author: Loren Merritt Date: Thu Mar 9 15:59:08 2006 +0000 macroblock-level ratecontrol: improved vbv strictness, and improved quality when using vbv. git-svn-id: svn://svn.videolan.org/x264/trunk@464 df754926-b1dd-0310-bc7b-ec298dee348c commit 79389771d6bc84a886b754bba995e7d9ac8b48d4 [revision 463] Author: Loren Merritt Date: Thu Mar 9 05:30:08 2006 +0000 keep transposed dct coefs. ~1% overall speedup. git-svn-id: svn://svn.videolan.org/x264/trunk@463 df754926-b1dd-0310-bc7b-ec298dee348c commit ce9b3336d66cff23019d43656baf425491702727 [revision 462] Author: Loren Merritt Date: Thu Mar 9 05:24:02 2006 +0000 tweak rounding of 8x8dct git-svn-id: svn://svn.videolan.org/x264/trunk@462 df754926-b1dd-0310-bc7b-ec298dee348c commit 9331f05948dfcd42461a8aa8b6f0994e594dc74a [revision 461] Author: Loren Merritt Date: Wed Mar 8 19:58:30 2006 +0000 cosmetics in makefile git-svn-id: svn://svn.videolan.org/x264/trunk@461 df754926-b1dd-0310-bc7b-ec298dee348c commit 058c3be5405df19bb0f029956c54251d626ca0f0 [revision 460] Author: Loren Merritt Date: Wed Mar 8 16:03:30 2006 +0000 cosmetics: muxers -> muxers.c git-svn-id: svn://svn.videolan.org/x264/trunk@460 df754926-b1dd-0310-bc7b-ec298dee348c commit a34eec24d5caed8b43f2d1ecf7a0f36b9fe60189 [revision 459] Author: Loren Merritt Date: Mon Mar 6 18:38:44 2006 +0000 no --nr in intra blocks. intra prediction doesn't work well enough for the residual to be indicative of noise. git-svn-id: svn://svn.videolan.org/x264/trunk@459 df754926-b1dd-0310-bc7b-ec298dee348c commit afbbaf9b0229751fe545e1ac8b8f1ca68228d56a [revision 458] Author: Loren Merritt Date: Mon Mar 6 03:21:38 2006 +0000 10l in direct auto + multiref + 1pass git-svn-id: svn://svn.videolan.org/x264/trunk@458 df754926-b1dd-0310-bc7b-ec298dee348c commit 9773268370492490235dee06d46e091f563626d7 [revision 457] Author: Loren Merritt Date: Sun Mar 5 07:01:58 2006 +0000 --direct auto selects direct mode per frame. works best in 2pass (enable in both passes). git-svn-id: svn://svn.videolan.org/x264/trunk@457 df754926-b1dd-0310-bc7b-ec298dee348c commit 126ccb3360e4c9b92ced4c995e618e4129be97a2 [revision 456] Author: Loren Merritt Date: Sun Mar 5 06:59:06 2006 +0000 change default direct mode to spatial git-svn-id: svn://svn.videolan.org/x264/trunk@456 df754926-b1dd-0310-bc7b-ec298dee348c commit 39c6d0824e23a3e6e769812968082026a9df61f8 [revision 455] Author: Loren Merritt Date: Sun Mar 5 06:28:40 2006 +0000 remove TODO. most of it is done, and the rest is out of date. git-svn-id: svn://svn.videolan.org/x264/trunk@455 df754926-b1dd-0310-bc7b-ec298dee348c commit 918a791bb4ed544368cd7389147ab9e18fb6f8d4 [revision 454] Author: Loren Merritt Date: Sun Mar 5 02:26:32 2006 +0000 more amd64 mmx intra prediction git-svn-id: svn://svn.videolan.org/x264/trunk@454 df754926-b1dd-0310-bc7b-ec298dee348c commit 469b4e5032b8183dd04c8cb6f22e9724146bb2f5 [revision 453] Author: Loren Merritt Date: Sun Mar 5 02:16:19 2006 +0000 for i8x8 neighbors, don't assume a new slice starts at the edge of the frame git-svn-id: svn://svn.videolan.org/x264/trunk@453 df754926-b1dd-0310-bc7b-ec298dee348c commit 4fe3aff6fd991a72b8a80b2157092678b13db433 [revision 452] Author: Sam Hocevar Date: Sat Mar 4 02:49:44 2006 +0000 * common/i386/i386inc.asm: got PIC to work for real on OS X x86. git-svn-id: svn://svn.videolan.org/x264/trunk@452 df754926-b1dd-0310-bc7b-ec298dee348c commit 4139febfe8acad10fb759b9d5a8992ed8cad6234 [revision 451] Author: Sam Hocevar Date: Thu Mar 2 20:48:08 2006 +0000 * common/i386/*.asm: don't use the "GLOBAL" reserved word, some versions NASM complain about it. Replaced it with "GOT_ebx". git-svn-id: svn://svn.videolan.org/x264/trunk@451 df754926-b1dd-0310-bc7b-ec298dee348c commit 059410eda5d125c166b5ba050e3ca8152e84a2c7 [revision 450] Author: Sam Hocevar Date: Thu Mar 2 20:46:54 2006 +0000 * configure: activate minor nasm optimisations, such as assembling "add eax, 8" as "add eax, byte 8". git-svn-id: svn://svn.videolan.org/x264/trunk@450 df754926-b1dd-0310-bc7b-ec298dee348c commit b5fda5741bf36e3f03cf0ebac90ddd9dce4f6420 [revision 449] Author: Sam Hocevar Date: Wed Mar 1 22:21:47 2006 +0000 * common/i386: factored the .rodata section declaration into i386inc.asm. git-svn-id: svn://svn.videolan.org/x264/trunk@449 df754926-b1dd-0310-bc7b-ec298dee348c commit 17683d75b1dae07e8dfc901883231b41e73fe8cd [revision 448] Author: Sam Hocevar Date: Wed Mar 1 22:16:16 2006 +0000 * configure common/i386/i386inc.asm: got rid of -DFORMAT_* nasm flags and use built-in preprocessor tests instead. git-svn-id: svn://svn.videolan.org/x264/trunk@448 df754926-b1dd-0310-bc7b-ec298dee348c commit 3723deea1601ba0dbda44ce09f77d6e1019226ac [revision 447] Author: Sam Hocevar Date: Wed Mar 1 22:12:22 2006 +0000 * common/i386/i386inc.asm: tell the ELF linker about our stack properties so that it does not assume the stack has to be executable. git-svn-id: svn://svn.videolan.org/x264/trunk@447 df754926-b1dd-0310-bc7b-ec298dee348c commit 70c140345c62ecb13ffa1af412f1e8f9f10567d2 [revision 446] Author: Loren Merritt Date: Tue Feb 28 19:49:00 2006 +0000 10l in r443 (p4x4 chroma) git-svn-id: svn://svn.videolan.org/x264/trunk@446 df754926-b1dd-0310-bc7b-ec298dee348c commit 3bd4ade21dcbf30287f3a350c25bda26ea667d22 [revision 445] Author: Loren Merritt Date: Mon Feb 27 07:31:36 2006 +0000 copy current macroblock to a smaller buffer, to improve cache coherency and reduce stride computations. part 3: asm git-svn-id: svn://svn.videolan.org/x264/trunk@445 df754926-b1dd-0310-bc7b-ec298dee348c commit 4ecb5f8ed11073fd4e6a4673a1275c430478aefc [revision 444] Author: Loren Merritt Date: Mon Feb 27 07:29:24 2006 +0000 copy current macroblock to a smaller buffer, to improve cache coherency and reduce stride computations. part 2: intra prediction git-svn-id: svn://svn.videolan.org/x264/trunk@444 df754926-b1dd-0310-bc7b-ec298dee348c commit 8f05dffc4574c40557a5a161b18c4e6037aeec48 [revision 443] Author: Loren Merritt Date: Mon Feb 27 07:28:20 2006 +0000 copy current macroblock to a smaller buffer, to improve cache coherency and reduce stride computations. part 1: memory arrangement. git-svn-id: svn://svn.videolan.org/x264/trunk@443 df754926-b1dd-0310-bc7b-ec298dee348c commit 388658234c05e9b282569be052a5977d6cc9e812 [revision 442] Author: Loren Merritt Date: Mon Feb 27 07:18:23 2006 +0000 h->mc.copy() git-svn-id: svn://svn.videolan.org/x264/trunk@442 df754926-b1dd-0310-bc7b-ec298dee348c commit 34cbb9170c3b9daeae91ef4aa2a48c2ec9bdfbc8 [revision 441] Author: Loren Merritt Date: Mon Feb 27 06:32:43 2006 +0000 lowres intra used wrong neighboring pixels git-svn-id: svn://svn.videolan.org/x264/trunk@441 df754926-b1dd-0310-bc7b-ec298dee348c commit bca81cae09973cd349382d8612ad6aaf412444b4 [revision 440] Author: Loren Merritt Date: Thu Feb 23 22:46:21 2006 +0000 trellis=2 slightly affected intra analysis even without subme=6 git-svn-id: svn://svn.videolan.org/x264/trunk@440 df754926-b1dd-0310-bc7b-ec298dee348c commit e5ed306c33a9b03d083660bb521758c76bdf36bd [revision 439] Author: Sam Hocevar Date: Thu Feb 16 22:00:46 2006 +0000 * encoder/ratecontrol.c: OS X support for exp2f and sqrtf. git-svn-id: svn://svn.videolan.org/x264/trunk@439 df754926-b1dd-0310-bc7b-ec298dee348c commit 14b26394bd35f6ab03c6d4b7424ddea893a5bfa1 [revision 438] Author: Loren Merritt Date: Thu Feb 16 01:32:56 2006 +0000 allow delta_qp > 26 git-svn-id: svn://svn.videolan.org/x264/trunk@438 df754926-b1dd-0310-bc7b-ec298dee348c commit 5fbca87deb6e3a79b9d3b6b31ea85fc79e49534f [revision 437] Author: Loren Merritt Date: Tue Feb 14 01:21:43 2006 +0000 ratecontrol didn't always account for header bits, causing an undersize in multipass with --ratetol inf. git-svn-id: svn://svn.videolan.org/x264/trunk@437 df754926-b1dd-0310-bc7b-ec298dee348c commit 12b778b8f4e8501cf06ba3513fc0e824d4a87ac1 [revision 436] Author: Loren Merritt Date: Mon Feb 13 17:36:22 2006 +0000 -q0 --b-rdo wasn't lossless git-svn-id: svn://svn.videolan.org/x264/trunk@436 df754926-b1dd-0310-bc7b-ec298dee348c commit 14f3cc06834e207b37222d93b7c6aea47b17524d [revision 435] Author: Loren Merritt Date: Mon Feb 13 04:34:15 2006 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@435 df754926-b1dd-0310-bc7b-ec298dee348c commit 476a0f93c9aa42a39c3518f891e024ef41b1056e [revision 434] Author: Loren Merritt Date: Sun Feb 12 06:50:19 2006 +0000 allow ',' separator for --filter git-svn-id: svn://svn.videolan.org/x264/trunk@434 df754926-b1dd-0310-bc7b-ec298dee348c commit 8dfd87aeef4c1523d60684ca6c1368007a24aad4 [revision 433] Author: Loren Merritt Date: Sun Feb 12 06:28:22 2006 +0000 VfW: 10l in bime and refs git-svn-id: svn://svn.videolan.org/x264/trunk@433 df754926-b1dd-0310-bc7b-ec298dee348c commit d53108a30cd1b1284c59eb9e8bdfac157a3ddb37 [revision 432] Author: Loren Merritt Date: Sun Feb 12 01:36:21 2006 +0000 more lowres mv clipping fixes git-svn-id: svn://svn.videolan.org/x264/trunk@432 df754926-b1dd-0310-bc7b-ec298dee348c commit eb32d28463ab8433fba16851d4796d041b8de39f [revision 431] Author: Loren Merritt Date: Sat Feb 11 22:04:57 2006 +0000 VfW: cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@431 df754926-b1dd-0310-bc7b-ec298dee348c commit 60e848749dbcb8a44709675ea391f5c28a8a8c1f [revision 430] Author: Loren Merritt Date: Sat Feb 11 20:11:05 2006 +0000 VfW: support trellis, brdo, nr, bime. patch by Dan Nelson (dnelson at allantgroup dot com). git-svn-id: svn://svn.videolan.org/x264/trunk@430 df754926-b1dd-0310-bc7b-ec298dee348c commit 681b394485671f977a1a19d2279ace4c22eb0177 [revision 429] Author: Loren Merritt Date: Fri Feb 10 21:58:43 2006 +0000 amd64 mmx for some intra pred functions git-svn-id: svn://svn.videolan.org/x264/trunk@429 df754926-b1dd-0310-bc7b-ec298dee348c commit e1d852d2947dfaac201dfb7149070ed341caa64f [revision 428] Author: Loren Merritt Date: Fri Feb 10 20:52:48 2006 +0000 dequant_mmx made incorrect assumptions about extreme inputs. now uses 32bit in more cases. patch by Christian Heine. git-svn-id: svn://svn.videolan.org/x264/trunk@428 df754926-b1dd-0310-bc7b-ec298dee348c commit fed2847ca9c4b9f8240be78145681d12ea85e1e9 [revision 427] Author: Loren Merritt Date: Fri Feb 10 01:16:40 2006 +0000 lowres can reuse the normal mv cost table git-svn-id: svn://svn.videolan.org/x264/trunk@427 df754926-b1dd-0310-bc7b-ec298dee348c commit f959a749aec65753e77a0b5566adb18d6a9af87f [revision 426] Author: Loren Merritt Date: Thu Feb 9 04:51:55 2006 +0000 r422 broke x264_center_filter_mmxext git-svn-id: svn://svn.videolan.org/x264/trunk@426 df754926-b1dd-0310-bc7b-ec298dee348c commit 7125f9174c8d32f66bf9264c2b986dd1c03f4a27 [revision 425] Author: Sam Hocevar Date: Wed Feb 8 12:45:21 2006 +0000 * configure: define FORMAT_ELF under Linux and FORMAT_AOUTB under *BSD. git-svn-id: svn://svn.videolan.org/x264/trunk@425 df754926-b1dd-0310-bc7b-ec298dee348c commit 5d5c5cc213fb25a1fd151af80ab0a4fb614dd32c [revision 424] Author: Sam Hocevar Date: Wed Feb 8 11:07:06 2006 +0000 * common/i386/i386inc.asm: support for ELF, a.out and Mach-O objects. git-svn-id: svn://svn.videolan.org/x264/trunk@424 df754926-b1dd-0310-bc7b-ec298dee348c commit 78d31c22dcc1c42d6b2009b52ba758958dd1bff4 [revision 423] Author: Sam Hocevar Date: Wed Feb 8 09:32:03 2006 +0000 * configure: added a --enable-pic flag. git-svn-id: svn://svn.videolan.org/x264/trunk@423 df754926-b1dd-0310-bc7b-ec298dee348c commit dc454eab263d463d2eeecf627aae31a10a5d080c [revision 422] Author: Sam Hocevar Date: Wed Feb 8 09:26:56 2006 +0000 * Additional fixes to the PIC versions of assembly routines. They now pass all checkasm tests and output streams are bit-by-bit identical, which sounds good. git-svn-id: svn://svn.videolan.org/x264/trunk@422 df754926-b1dd-0310-bc7b-ec298dee348c commit ac9da5dbb4447c64bf9b82e849f4ae233c4413d3 [revision 421] Author: Sam Hocevar Date: Wed Feb 8 09:03:28 2006 +0000 * tools/checkasm.c: print the random seed used for the test, to allow for replays. It looks like dequant_4x4 fails 1 time out of 600, with the following seeds for instance: 1423 1957 2149 2455 3385 3403 3724 4095. git-svn-id: svn://svn.videolan.org/x264/trunk@421 df754926-b1dd-0310-bc7b-ec298dee348c commit bb21f3a920ffefe84a77933c060775b2089a9c6c [revision 420] Author: Loren Merritt Date: Wed Feb 8 00:53:35 2006 +0000 cosmetics in mc_chroma git-svn-id: svn://svn.videolan.org/x264/trunk@420 df754926-b1dd-0310-bc7b-ec298dee348c commit 80b669bbc73e92944954cadc612650ba08b80358 [revision 419] Author: Sam Hocevar Date: Tue Feb 7 19:05:47 2006 +0000 * Oh, so what I thought was unused code was in fact used. This fixes my breakage but makes the code rather slow in PIC mode. I will fix it later. git-svn-id: svn://svn.videolan.org/x264/trunk@419 df754926-b1dd-0310-bc7b-ec298dee348c commit eea893893af7ae49cd9cab333279f0323302db81 [revision 418] Author: Sam Hocevar Date: Tue Feb 7 17:40:56 2006 +0000 * Support for x86 position-independent code (PIC), needed for dynamic libs on Mac OS X Intel. I tried to make this as little intrusive as possible. git-svn-id: svn://svn.videolan.org/x264/trunk@418 df754926-b1dd-0310-bc7b-ec298dee348c commit 97f05071bb3aed659785ec10ccd6824020dfaef8 [revision 417] Author: Loren Merritt Date: Mon Feb 6 21:34:52 2006 +0000 msvc: #define isfinite() git-svn-id: svn://svn.videolan.org/x264/trunk@417 df754926-b1dd-0310-bc7b-ec298dee348c commit 19d07afabccda339852d7d49d3d5b11d538181f3 [revision 416] Author: Loren Merritt Date: Mon Feb 6 06:14:04 2006 +0000 x86 mmx for some intra pred functions git-svn-id: svn://svn.videolan.org/x264/trunk@416 df754926-b1dd-0310-bc7b-ec298dee348c commit d2dada763c6114b6245c6b468dfc2287123d12c5 [revision 415] Author: Loren Merritt Date: Mon Feb 6 05:53:44 2006 +0000 cosmetics: reorganize intra prediction dsp git-svn-id: svn://svn.videolan.org/x264/trunk@415 df754926-b1dd-0310-bc7b-ec298dee348c commit 791495e3d82b982ffa086593956bb96da45973e2 [revision 414] Author: Loren Merritt Date: Mon Feb 6 03:49:38 2006 +0000 too many systems don't have off_t; use uint64_t instead. git-svn-id: svn://svn.videolan.org/x264/trunk@414 df754926-b1dd-0310-bc7b-ec298dee348c commit ce237ab663525259ce64423341ce6893309cee88 [revision 413] Author: Loren Merritt Date: Sat Feb 4 05:39:02 2006 +0000 fix order of frame evaluation in pre-me git-svn-id: svn://svn.videolan.org/x264/trunk@413 df754926-b1dd-0310-bc7b-ec298dee348c commit f116707e12e1ba980c8cf6a091f3290ba4d75af4 [revision 412] Author: Loren Merritt Date: Fri Feb 3 18:23:26 2006 +0000 update AUTHORS git-svn-id: svn://svn.videolan.org/x264/trunk@412 df754926-b1dd-0310-bc7b-ec298dee348c commit 8b498e443950c642fd5f6c1406636f4ab5def27e [revision 411] Author: Loren Merritt Date: Thu Feb 2 04:37:05 2006 +0000 fix a check for NaN in ratecontrol git-svn-id: svn://svn.videolan.org/x264/trunk@411 df754926-b1dd-0310-bc7b-ec298dee348c commit e3b1f110b9eb2286e481febf29a015256a48c576 [revision 410] Author: Loren Merritt Date: Sun Jan 29 08:46:20 2006 +0000 fix mv predictors in pre-me for b-adapt. git-svn-id: svn://svn.videolan.org/x264/trunk@410 df754926-b1dd-0310-bc7b-ec298dee348c commit 6da9fc956cdafd6dde4c334568c18bd7bef292c1 [revision 409] Author: Loren Merritt Date: Mon Jan 23 02:02:50 2006 +0000 print --nr in sei params. tweak ratecontrol param checking. git-svn-id: svn://svn.videolan.org/x264/trunk@409 df754926-b1dd-0310-bc7b-ec298dee348c commit be6cce52d13fc6424e6244bfce03f67894c15d1e [revision 408] Author: Måns Rullgård Date: Thu Jan 19 00:05:42 2006 +0000 I've moved git-svn-id: svn://svn.videolan.org/x264/trunk@408 df754926-b1dd-0310-bc7b-ec298dee348c commit 273dc626b8b4f7cfe166be728038d9d3e9fd1fb7 [revision 407] Author: Måns Rullgård Date: Thu Jan 19 00:05:05 2006 +0000 write correct VUI timing info git-svn-id: svn://svn.videolan.org/x264/trunk@407 df754926-b1dd-0310-bc7b-ec298dee348c commit d125e4373da6f5e50d2e4ab73aab2d97732212c5 [revision 406] Author: Loren Merritt Date: Wed Jan 18 07:42:29 2006 +0000 early termination in UMH search git-svn-id: svn://svn.videolan.org/x264/trunk@406 df754926-b1dd-0310-bc7b-ec298dee348c commit a373f2aef22a54c8d597156429ff7f0ad41f1c9e [revision 405] Author: Loren Merritt Date: Wed Jan 18 07:08:01 2006 +0000 split mv_range enforcement from edge-of-frame clipping. fixes an occasional artifact with long mvs. git-svn-id: svn://svn.videolan.org/x264/trunk@405 df754926-b1dd-0310-bc7b-ec298dee348c commit 61b57afb7b614ab09a0508c4c53ca411f4f675f8 [revision 404] Author: Loren Merritt Date: Wed Jan 18 04:29:18 2006 +0000 cosmetics: suppress warning on unused variables git-svn-id: svn://svn.videolan.org/x264/trunk@404 df754926-b1dd-0310-bc7b-ec298dee348c commit 271c1947a599ccdc3a509260da2d5cd6699148d7 [revision 403] Author: Loren Merritt Date: Tue Jan 17 21:16:28 2006 +0000 cosmetics: simplify #includes git-svn-id: svn://svn.videolan.org/x264/trunk@403 df754926-b1dd-0310-bc7b-ec298dee348c commit 7fb7e61b6247d9571fe7244f03231adbcc1d7e75 [revision 402] Author: Sam Hocevar Date: Mon Jan 16 12:23:35 2006 +0000 * configure: NSLU2 platform support (why oh why) git-svn-id: svn://svn.videolan.org/x264/trunk@402 df754926-b1dd-0310-bc7b-ec298dee348c commit 1d82beb4c26d82d068e84265344e078854907c1e [revision 401] Author: Eric Petit Date: Sun Jan 15 22:29:15 2006 +0000 Re-enabled x86 optims on MacIntel, assume Nasm CVS is installed and -f macho -DPREFIX just seems to do the job git-svn-id: svn://svn.videolan.org/x264/trunk@401 df754926-b1dd-0310-bc7b-ec298dee348c commit 096c4eb70a5b3cc5aebda44e62b4d3dd83edbc9c [revision 400] Author: Eric Petit Date: Sat Jan 14 16:11:48 2006 +0000 Quick compile fix for OS X / Intel Optimizations are disabled at the moment. In order to get them to work, we'd need either nasm to be able to output Mach-O object files, or we should convert the assembly code to something OS X can handle, like gas. git-svn-id: svn://svn.videolan.org/x264/trunk@400 df754926-b1dd-0310-bc7b-ec298dee348c commit 743ad5971608e491bd1d70a31e9bfdc496301eef [revision 399] Author: Loren Merritt Date: Fri Jan 13 06:54:10 2006 +0000 cli: large file support git-svn-id: svn://svn.videolan.org/x264/trunk@399 df754926-b1dd-0310-bc7b-ec298dee348c commit 0abf15d29e904339fb9f606e83e01c9265e54b15 [revision 398] Author: Loren Merritt Date: Tue Jan 10 07:31:29 2006 +0000 dct-domain noise reduction (ported from lavc) git-svn-id: svn://svn.videolan.org/x264/trunk@398 df754926-b1dd-0310-bc7b-ec298dee348c commit 6bf39eaa780ef0877b7d6fe8497df9a38d4baa3d [revision 397] Author: Loren Merritt Date: Mon Jan 9 06:18:39 2006 +0000 early termination within large SADs. ~1% faster UMH, ~4% faster ESA. git-svn-id: svn://svn.videolan.org/x264/trunk@397 df754926-b1dd-0310-bc7b-ec298dee348c commit 73a45ef20dec4dc709e029f175eb20ae8eb099b9 [revision 396] Author: Loren Merritt Date: Thu Jan 5 19:11:38 2006 +0000 mkv: increase nalu size size to 4 bytes. patch by Haali. git-svn-id: svn://svn.videolan.org/x264/trunk@396 df754926-b1dd-0310-bc7b-ec298dee348c commit d3c2f10353e8409932f05be20c11f4eae09974c1 [revision 395] Author: Loren Merritt Date: Wed Jan 4 03:43:15 2006 +0000 less 64bit math: 12% faster trellis git-svn-id: svn://svn.videolan.org/x264/trunk@395 df754926-b1dd-0310-bc7b-ec298dee348c commit 28c0f2419db96278a14d126de3859a67d31d0a84 [revision 394] Author: Loren Merritt Date: Sun Jan 1 10:14:17 2006 +0000 more error checking of input parameters git-svn-id: svn://svn.videolan.org/x264/trunk@394 df754926-b1dd-0310-bc7b-ec298dee348c commit bf1e4d1faba2eff0f54029ccf4d98ce9ef09a757 [revision 393] Author: Loren Merritt Date: Sun Jan 1 09:39:29 2006 +0000 always write sps.vui git-svn-id: svn://svn.videolan.org/x264/trunk@393 df754926-b1dd-0310-bc7b-ec298dee348c commit 7091b47e853fb45ae2d9432ea7ffe085efa31936 [revision 392] Author: Loren Merritt Date: Sat Dec 31 14:16:02 2005 +0000 use some extra packing modes for CQM headers. fix typo in --cqm4p[yc]. git-svn-id: svn://svn.videolan.org/x264/trunk@392 df754926-b1dd-0310-bc7b-ec298dee348c commit a977f764240cb9139c2152448bb85dd89260639f [revision 391] Author: Loren Merritt Date: Fri Dec 30 08:26:42 2005 +0000 MSVC compatibility fixes git-svn-id: svn://svn.videolan.org/x264/trunk@391 df754926-b1dd-0310-bc7b-ec298dee348c commit 2fa8f84b6e108222735c2895b6419ed8c29ef031 [revision 390] Author: Loren Merritt Date: Fri Dec 30 04:56:49 2005 +0000 joint bidirectional motion refinement (--bime) git-svn-id: svn://svn.videolan.org/x264/trunk@390 df754926-b1dd-0310-bc7b-ec298dee348c commit 684d2d58a5e60bec5bd45834e1c87b4b150c4244 [revision 389] Author: Loren Merritt Date: Sat Dec 24 20:59:13 2005 +0000 fix some overflows in mp4 timestamps. patch by Francesco Corriga. git-svn-id: svn://svn.videolan.org/x264/trunk@389 df754926-b1dd-0310-bc7b-ec298dee348c commit 25b40141a3d6569bfdc58a94d3004a89211029d6 [revision 388] Author: Loren Merritt Date: Tue Dec 20 02:57:52 2005 +0000 Successive elimination motion search: same as exhaustive search, but 2-3x faster. git-svn-id: svn://svn.videolan.org/x264/trunk@388 df754926-b1dd-0310-bc7b-ec298dee348c commit a9607af8a776bb00aa463fa926fb4e4661eff1e4 [revision 387] Author: Eric Petit Date: Tue Dec 13 16:32:39 2005 +0000 Fixed cc_check on OS X (gcc -o /dev/null always fails) git-svn-id: svn://svn.videolan.org/x264/trunk@387 df754926-b1dd-0310-bc7b-ec298dee348c commit b914f8081539c243a7a3f5a15a11145e06466da9 [revision 386] Author: Loren Merritt Date: Tue Dec 13 11:24:02 2005 +0000 postpone pskip decision until after p16x16ref0 motion search. reduces the number of erroneous pskips in low-detail regions. git-svn-id: svn://svn.videolan.org/x264/trunk@386 df754926-b1dd-0310-bc7b-ec298dee348c commit cc3308878925bf33c0e2707c9177dd345ed238a5 [revision 385] Author: Loren Merritt Date: Wed Dec 7 17:29:20 2005 +0000 configure: autodetect gpac, avis, pthread, vfw git-svn-id: svn://svn.videolan.org/x264/trunk@385 df754926-b1dd-0310-bc7b-ec298dee348c commit 38fcbfbeb53d402f9431f18709aee37987dcf318 [revision 384] Author: Loren Merritt Date: Mon Dec 5 12:46:46 2005 +0000 --no-fast-pskip patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@384 df754926-b1dd-0310-bc7b-ec298dee348c commit 5ce628fa0413b7d87e87619a65a9e1cabe5cd5be [revision 383] Author: Loren Merritt Date: Mon Dec 5 12:38:46 2005 +0000 cosmetics: config.h is now modified only by configure. make now calls configure if you haven't. git-svn-id: svn://svn.videolan.org/x264/trunk@383 df754926-b1dd-0310-bc7b-ec298dee348c commit f03dbfd42d110e982323f78aac131024e0687590 [revision 382] Author: Loren Merritt Date: Sun Dec 4 21:19:17 2005 +0000 MP4: set "track enabled" flag. patch by Robert Swain. git-svn-id: svn://svn.videolan.org/x264/trunk@382 df754926-b1dd-0310-bc7b-ec298dee348c commit 8443f260777556ebd6132f6448d406c769194e23 [revision 381] Author: Loren Merritt Date: Sat Dec 3 01:50:52 2005 +0000 faster subpel motion search. patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@381 df754926-b1dd-0310-bc7b-ec298dee348c commit 8f0d66cc0973cfb8360fad55b22248fe620def34 [revision 380] Author: Loren Merritt Date: Mon Nov 28 07:43:22 2005 +0000 don't use gnu extensions to grep and sed. git-svn-id: svn://svn.videolan.org/x264/trunk@380 df754926-b1dd-0310-bc7b-ec298dee348c commit 6826cf2faf2692b0bb37780148a89b0e58826f6b [revision 379] Author: Loren Merritt Date: Mon Nov 28 02:03:12 2005 +0000 pkg-config: major.minor.patch version git-svn-id: svn://svn.videolan.org/x264/trunk@379 df754926-b1dd-0310-bc7b-ec298dee348c commit 8e44d938b225c0a4dabad257b471335fdd0fe18d [revision 378] Author: Loren Merritt Date: Mon Nov 28 00:29:10 2005 +0000 `make fprofiled` to automate gcc -fprofile-generate/use git-svn-id: svn://svn.videolan.org/x264/trunk@378 df754926-b1dd-0310-bc7b-ec298dee348c commit 71b75efe735e76d8d6ec4b51cd09b477dc0908cc [revision 377] Author: Loren Merritt Date: Sun Nov 27 23:24:43 2005 +0000 10l git-svn-id: svn://svn.videolan.org/x264/trunk@377 df754926-b1dd-0310-bc7b-ec298dee348c commit bdddcf97476ae25a8bd80339090c399b59b8c2f3 [revision 376] Author: Loren Merritt Date: Sun Nov 27 23:23:49 2005 +0000 param.b_repeat_headers (not yet used) git-svn-id: svn://svn.videolan.org/x264/trunk@376 df754926-b1dd-0310-bc7b-ec298dee348c commit 8c7611c14281f5d597262aa66771f0b9b50366a8 [revision 375] Author: Loren Merritt Date: Tue Nov 22 19:27:54 2005 +0000 support pkg-config. patch by Caro. git-svn-id: svn://svn.videolan.org/x264/trunk@375 df754926-b1dd-0310-bc7b-ec298dee348c commit 78d2f605d0293484d50d58b74489700b65cc0472 [revision 374] Author: Loren Merritt Date: Tue Nov 22 06:59:48 2005 +0000 write encoding options to the userdata SEI and to the 2pass statsfile. check for incompatible options in the 2nd pass. git-svn-id: svn://svn.videolan.org/x264/trunk@374 df754926-b1dd-0310-bc7b-ec298dee348c commit c010ba1dde8fb861417e30c0d4316c6cb33064dd [revision 373] Author: Loren Merritt Date: Tue Nov 22 05:02:33 2005 +0000 change default level to "5.1" git-svn-id: svn://svn.videolan.org/x264/trunk@373 df754926-b1dd-0310-bc7b-ec298dee348c commit 05e6cf0516ff5646c841fa66d96be5e0264b0daa [revision 372] Author: Loren Merritt Date: Tue Nov 22 02:53:53 2005 +0000 skip dequant+idct of decimated blocks. git-svn-id: svn://svn.videolan.org/x264/trunk@372 df754926-b1dd-0310-bc7b-ec298dee348c commit bc478923aefe1e4aa5e0201b2214f1ed8ad8f719 [revision 371] Author: Loren Merritt Date: Tue Nov 22 02:48:39 2005 +0000 after a 1pass ABR, print the value of --crf which would result in the same bitrate. git-svn-id: svn://svn.videolan.org/x264/trunk@371 df754926-b1dd-0310-bc7b-ec298dee348c commit 528cbd1f16cd20b3ee8bbfd5b5edf6634a7f4634 [revision 370] Author: Loren Merritt Date: Tue Nov 22 02:43:11 2005 +0000 subpel search: always check mvp. git-svn-id: svn://svn.videolan.org/x264/trunk@370 df754926-b1dd-0310-bc7b-ec298dee348c commit 429e0603017630feb239e22de3eb279ee02932c9 [revision 369] Author: Loren Merritt Date: Tue Nov 22 02:36:29 2005 +0000 faster b-rdo (skip RD of modes with bad SATD). patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@369 df754926-b1dd-0310-bc7b-ec298dee348c commit 6fe92323b2e007e3a31714fb5b090b732fc24e62 [revision 368] Author: Loren Merritt Date: Fri Nov 18 11:20:07 2005 +0000 RD mode decision for B-frames (--b-rdo) patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@368 df754926-b1dd-0310-bc7b-ec298dee348c commit 322c42ad8ca9ac3d4755e62fadb29c3ef7a4ecf5 [revision 367] Author: Sam Hocevar Date: Fri Nov 11 23:57:18 2005 +0000 * common/amd64/quant-a.asm: added missing GLOBAL flags that prevented PIC builds, thanks to Anssi Hannula. git-svn-id: svn://svn.videolan.org/x264/trunk@367 df754926-b1dd-0310-bc7b-ec298dee348c commit ffd008ebdd6ebbf6f83dbf08315f3765a072261a [revision 366] Author: Sam Hocevar Date: Fri Nov 11 17:46:24 2005 +0000 * configure: added the Alpha platform. git-svn-id: svn://svn.videolan.org/x264/trunk@366 df754926-b1dd-0310-bc7b-ec298dee348c commit 9187a8f84ba24b2825487971ce94db404303393d [revision 365] Author: Loren Merritt Date: Mon Nov 7 07:53:42 2005 +0000 use array_non_zero() when we don't need a full array_non_zero_count() git-svn-id: svn://svn.videolan.org/x264/trunk@365 df754926-b1dd-0310-bc7b-ec298dee348c commit d18bbd3b2e28958b9e153b62033a7f66f6fea0ec [revision 364] Author: Loren Merritt Date: Sun Nov 6 07:07:30 2005 +0000 mmx dequant. up to 3% speedup w/ RD. git-svn-id: svn://svn.videolan.org/x264/trunk@364 df754926-b1dd-0310-bc7b-ec298dee348c commit d447c2d3db71e0b422ed9330ac26410ba9f90622 [revision 363] Author: Loren Merritt Date: Sun Nov 6 00:26:43 2005 +0000 allow --level to understand names in addition to idc git-svn-id: svn://svn.videolan.org/x264/trunk@363 df754926-b1dd-0310-bc7b-ec298dee348c commit 87e5994706c76bd628c7e23f0dca95f05e922a7c [revision 362] Author: Loren Merritt Date: Fri Nov 4 11:39:58 2005 +0000 check (most of) the levels constaints. set default max_mv_range based on level_idc. git-svn-id: svn://svn.videolan.org/x264/trunk@362 df754926-b1dd-0310-bc7b-ec298dee348c commit 1e80b69b3717e019f6cbab071582f9812b85fa4d [revision 361] Author: Loren Merritt Date: Thu Nov 3 22:57:52 2005 +0000 if p16x16 RD decides to code a MB as p_skip, then don't check smaller partitions. git-svn-id: svn://svn.videolan.org/x264/trunk@361 df754926-b1dd-0310-bc7b-ec298dee348c commit 5c43fb3b66b5ccf4ae0c4bd63599bf3f64d4557e [revision 360] Author: Loren Merritt Date: Thu Nov 3 22:20:47 2005 +0000 Trellis RD quantization. around +.2 dB git-svn-id: svn://svn.videolan.org/x264/trunk@360 df754926-b1dd-0310-bc7b-ec298dee348c commit 3f1ed7cee623b64fd66fc60db62275df23177966 [revision 359] Author: Loren Merritt Date: Thu Nov 3 20:16:56 2005 +0000 cosmetics: XCHG macro git-svn-id: svn://svn.videolan.org/x264/trunk@359 df754926-b1dd-0310-bc7b-ec298dee348c commit 662e56b59eee5ddc15e5fb8c53c7cd49bcc39eeb [revision 358] Author: Loren Merritt Date: Thu Nov 3 11:27:24 2005 +0000 skip a few duplicate candidates in qpel search. git-svn-id: svn://svn.videolan.org/x264/trunk@358 df754926-b1dd-0310-bc7b-ec298dee348c commit 2107a4f7204f0a764830b562e86d50f2b979a0b8 [revision 357] Author: Loren Merritt Date: Thu Nov 3 11:26:17 2005 +0000 skip a few duplicate candidates in fullpel hex&umh search. git-svn-id: svn://svn.videolan.org/x264/trunk@357 df754926-b1dd-0310-bc7b-ec298dee348c commit 01c05a79022c2349ddba2f3e101f8a1d26500906 [revision 356] Author: Loren Merritt Date: Thu Nov 3 06:53:59 2005 +0000 cli: arithmetic overflow in bitrate printing git-svn-id: svn://svn.videolan.org/x264/trunk@356 df754926-b1dd-0310-bc7b-ec298dee348c commit db67b818250aa75680df5ff15ff58418e850d321 [revision 355] Author: Loren Merritt Date: Thu Nov 3 06:47:19 2005 +0000 cosmetics in x264_cabac_mb_type git-svn-id: svn://svn.videolan.org/x264/trunk@355 df754926-b1dd-0310-bc7b-ec298dee348c commit 89d2c6a13fc2c864191af2ad86d07dee69a6c75b [revision 354] Author: Loren Merritt Date: Thu Nov 3 06:40:46 2005 +0000 X264_ABS => abs git-svn-id: svn://svn.videolan.org/x264/trunk@354 df754926-b1dd-0310-bc7b-ec298dee348c commit d13a18680572b8ae1075f9a2d53bf57b51eab6ec [revision 353] Author: Loren Merritt Date: Thu Nov 3 02:42:48 2005 +0000 amd64 sse2 8x8dct. 1.45x faster than mmx. git-svn-id: svn://svn.videolan.org/x264/trunk@353 df754926-b1dd-0310-bc7b-ec298dee348c commit 08e19ed8f28e5bb1fdd951eb2bab04c0248f9af1 [revision 352] Author: Loren Merritt Date: Tue Nov 1 03:34:48 2005 +0000 allow 1pass ratecontrol with keyint=1 git-svn-id: svn://svn.videolan.org/x264/trunk@352 df754926-b1dd-0310-bc7b-ec298dee348c commit 9617e25c2ea08d029decb106fa7cf51a13a03706 [revision 351] Author: Loren Merritt Date: Mon Oct 31 04:02:15 2005 +0000 cli: print estimated time left in --progress git-svn-id: svn://svn.videolan.org/x264/trunk@351 df754926-b1dd-0310-bc7b-ec298dee348c commit d484bce60eb3405b2d1bc666a61120dea6bbe294 [revision 350] Author: Loren Merritt Date: Mon Oct 31 02:52:33 2005 +0000 doc/ratecontrol.txt git-svn-id: svn://svn.videolan.org/x264/trunk@350 df754926-b1dd-0310-bc7b-ec298dee348c commit 5aced82614a4be3106eec04ba983d122e9e7f668 [revision 349] Author: Loren Merritt Date: Mon Oct 31 02:50:36 2005 +0000 rm doc/dct.txt git-svn-id: svn://svn.videolan.org/x264/trunk@349 df754926-b1dd-0310-bc7b-ec298dee348c commit b179e4740f7624ea1be4db0682a658fe6822a9e8 [revision 348] Author: Loren Merritt Date: Sun Oct 30 23:42:24 2005 +0000 in constant QP mode, write that QP in the PPS to save a few bits in each slice header. git-svn-id: svn://svn.videolan.org/x264/trunk@348 df754926-b1dd-0310-bc7b-ec298dee348c commit 108f197cb62a9f29b0b671e2eceafd8ccc4ded21 [revision 347] Author: Loren Merritt Date: Sun Oct 30 06:22:29 2005 +0000 faster decimation git-svn-id: svn://svn.videolan.org/x264/trunk@347 df754926-b1dd-0310-bc7b-ec298dee348c commit fa01979f7260543c845d0823d4a7c0774bcf5a16 [revision 346] Author: Loren Merritt Date: Sat Oct 29 04:05:57 2005 +0000 cosmetics: fix an erroneous warning from r340. git-svn-id: svn://svn.videolan.org/x264/trunk@346 df754926-b1dd-0310-bc7b-ec298dee348c commit 8d857c5601be89dc32d995c519c096805249f77f [revision 345] Author: Loren Merritt Date: Sat Oct 29 03:00:50 2005 +0000 cosmetics: change literal cabac_block_cat to an enum. git-svn-id: svn://svn.videolan.org/x264/trunk@345 df754926-b1dd-0310-bc7b-ec298dee348c commit c636f90355a1855b0b2576d79d34541c063daee5 [revision 344] Author: Loren Merritt Date: Sat Oct 29 02:21:39 2005 +0000 cabac: merge i_state with i_mps. bs_write multiple bits at once. git-svn-id: svn://svn.videolan.org/x264/trunk@344 df754926-b1dd-0310-bc7b-ec298dee348c commit 5167ebb2bcfdd525d47abc91329d3588feab0b5f [revision 343] Author: Loren Merritt Date: Sat Oct 29 01:43:29 2005 +0000 remove unused adaptive cabac_idc code git-svn-id: svn://svn.videolan.org/x264/trunk@343 df754926-b1dd-0310-bc7b-ec298dee348c commit 817ef1468a80b20a76b4c12af44e3b85339880d5 [revision 342] Author: Eric Petit Date: Thu Oct 27 10:27:04 2005 +0000 Fixed compilation on PPC (spotted by David Wolstencroft) git-svn-id: svn://svn.videolan.org/x264/trunk@342 df754926-b1dd-0310-bc7b-ec298dee348c commit 109ae085288c0068e2f40bfffd41070bd25dfa8b [revision 341] Author: Loren Merritt Date: Wed Oct 26 08:38:11 2005 +0000 mmx deblocking. 2.5x faster deblocking functions, 1-4% overall. git-svn-id: svn://svn.videolan.org/x264/trunk@341 df754926-b1dd-0310-bc7b-ec298dee348c commit 166601503800e00a33d88eb488da744a486ecb77 [revision 340] Author: Loren Merritt Date: Wed Oct 26 07:04:59 2005 +0000 If frame count is known at init time (cli & vfw), then abort if the 2nd pass exceeds the length of the 1st pass. If it's not known (mencoder), then report a non-fatal error when we run off the end of the 1st pass stats, and switch to constant QP. git-svn-id: svn://svn.videolan.org/x264/trunk@340 df754926-b1dd-0310-bc7b-ec298dee348c commit 75832019417943ed6a68b99bd75f5ef7efe1d998 [revision 339] Author: Loren Merritt Date: Wed Oct 26 06:40:51 2005 +0000 move checkasm to tools/ delete unused stuff in testing/ `make clean` deletes checkasm and avc2avi git-svn-id: svn://svn.videolan.org/x264/trunk@339 df754926-b1dd-0310-bc7b-ec298dee348c commit 6347263823e0fce26593fe36d812ba95931ebcb0 [revision 338] Author: Loren Merritt Date: Wed Oct 26 06:31:35 2005 +0000 checkasm: check 8x8dct, mc average, quant, and SSE2. git-svn-id: svn://svn.videolan.org/x264/trunk@338 df754926-b1dd-0310-bc7b-ec298dee348c commit 57900a1b0caa43372433b7bca25b26d764fadaff [revision 337] Author: Loren Merritt Date: Wed Oct 26 06:30:19 2005 +0000 r336 broke amd64 x264_pixel_sad_16x16_sse2 (though it's not being used) git-svn-id: svn://svn.videolan.org/x264/trunk@337 df754926-b1dd-0310-bc7b-ec298dee348c commit 360eb55eda428cba8d6d4e411ff87e0d5dedbf05 [revision 336] Author: Loren Merritt Date: Tue Oct 25 10:57:29 2005 +0000 Windows 64bit asm. patch by squid_80. git-svn-id: svn://svn.videolan.org/x264/trunk@336 df754926-b1dd-0310-bc7b-ec298dee348c commit 6d969739baac6b9f7e9bcb44c3b7dbc21890dd1b [revision 335] Author: Loren Merritt Date: Mon Oct 24 16:15:11 2005 +0000 delete build/cygwin because it's handled in the main configure/makefile. git-svn-id: svn://svn.videolan.org/x264/trunk@335 df754926-b1dd-0310-bc7b-ec298dee348c commit 0ddc9d5524a48882ac804948775fd7a35b3a07da [revision 334] Author: Loren Merritt Date: Sun Oct 23 09:52:34 2005 +0000 --crf: 1pass quality-based VBR. git-svn-id: svn://svn.videolan.org/x264/trunk@334 df754926-b1dd-0310-bc7b-ec298dee348c commit 06f1dafd17e9ebb1cd9d271fd72eb0c04e2337bc [revision 333] Author: Eric Petit Date: Sun Oct 16 09:53:05 2005 +0000 Added --enable-gprof (patch by Johannes Reinhardt) git-svn-id: svn://svn.videolan.org/x264/trunk@333 df754926-b1dd-0310-bc7b-ec298dee348c commit 2ac5fe040b35546a7d7bc0b463fd4a9cb268ff3b [revision 332] Author: Loren Merritt Date: Sun Oct 16 05:44:50 2005 +0000 cosmetics: remove #if0'ed code patch by Robert Swain. git-svn-id: svn://svn.videolan.org/x264/trunk@332 df754926-b1dd-0310-bc7b-ec298dee348c commit 1647e6d6e147b3e2072b4f36b1ed27df0715ff0d [revision 331] Author: Loren Merritt Date: Sun Oct 16 01:47:30 2005 +0000 faster bs_write git-svn-id: svn://svn.videolan.org/x264/trunk@331 df754926-b1dd-0310-bc7b-ec298dee348c commit b659ca6f53df6f7b1b423112ef0f95e7eb166ef5 [revision 330] Author: Loren Merritt Date: Sat Oct 15 04:11:06 2005 +0000 during RDO, skip the bitstream writing and just calculate the number of bits that would be used. speedup: cabac +4-8%, cavlc +2-4%. git-svn-id: svn://svn.videolan.org/x264/trunk@330 df754926-b1dd-0310-bc7b-ec298dee348c commit 48c2e935e3638a38c988b11204ff52a85bf48fc9 [revision 329] Author: Loren Merritt Date: Sat Oct 15 00:27:17 2005 +0000 Use SAD instead of SATD for halfpel motion search. Move multiref termination after halfpel search. Total: 3-7% speedup and +/-.02 dB. patch by Alex Wright. git-svn-id: svn://svn.videolan.org/x264/trunk@329 df754926-b1dd-0310-bc7b-ec298dee348c commit a8ac858b06ddca09acd98c35456b1008412cbe94 [revision 328] Author: Loren Merritt Date: Thu Oct 13 18:19:38 2005 +0000 VfW: mixed refs. patch by celtic_druid. git-svn-id: svn://svn.videolan.org/x264/trunk@328 df754926-b1dd-0310-bc7b-ec298dee348c commit d69837d312aa09c020416008c26f7008783d8c7f [revision 327] Author: Loren Merritt Date: Mon Oct 10 22:51:26 2005 +0000 allow non-mod16 resolutions git-svn-id: svn://svn.videolan.org/x264/trunk@327 df754926-b1dd-0310-bc7b-ec298dee348c commit 67f2a4791ca35a019dd645818c2c95f2b88c936e [revision 326] Author: Loren Merritt Date: Mon Oct 10 01:29:17 2005 +0000 VfW: prevent duplicate free() in compress_end() git-svn-id: svn://svn.videolan.org/x264/trunk@326 df754926-b1dd-0310-bc7b-ec298dee348c commit 0bde6ae12b9eda914fa51da95bef8beae09ea8f0 [revision 325] Author: Loren Merritt Date: Mon Oct 10 00:32:45 2005 +0000 cosmetics: remove declarations of nonexistent asm functions git-svn-id: svn://svn.videolan.org/x264/trunk@325 df754926-b1dd-0310-bc7b-ec298dee348c commit 015ac5865c81ee94125493aca28d0ccbc0f639b4 [revision 324] Author: Loren Merritt Date: Sun Oct 9 21:55:53 2005 +0000 cosmetics (whitespace) in VfW git-svn-id: svn://svn.videolan.org/x264/trunk@324 df754926-b1dd-0310-bc7b-ec298dee348c commit 54d413b9ad22244599489a0c50e99fafa07b89a1 [revision 323] Author: Loren Merritt Date: Sun Oct 9 21:37:56 2005 +0000 VfW: some reorganization patch by Francesco Corriga. git-svn-id: svn://svn.videolan.org/x264/trunk@323 df754926-b1dd-0310-bc7b-ec298dee348c commit a75462ead66beb222aae1efe1958848c26dc4be6 [revision 322] Author: Loren Merritt Date: Sun Oct 9 06:10:03 2005 +0000 cosmetics: merge some duplicate tables git-svn-id: svn://svn.videolan.org/x264/trunk@322 df754926-b1dd-0310-bc7b-ec298dee348c commit 1c6ccbf543ac7725e46f94bdb24fa6784d315962 [revision 321] Author: Loren Merritt Date: Sun Oct 9 03:32:56 2005 +0000 remove cabac byte-stuffing code, because it just wastes bits in lossless, and does nothing at all at sane bitrates. git-svn-id: svn://svn.videolan.org/x264/trunk@321 df754926-b1dd-0310-bc7b-ec298dee348c commit acee2d5168a39f301b7cda1d4effe943e321e1f8 [revision 320] Author: Loren Merritt Date: Sun Oct 9 00:14:30 2005 +0000 don't allocate lowres planes if they won't be used (i.e. in the 2nd pass). git-svn-id: svn://svn.videolan.org/x264/trunk@320 df754926-b1dd-0310-bc7b-ec298dee348c commit 938c52d2a7285c5872eea2f5d165a1b26699b349 [revision 319] Author: Loren Merritt Date: Sat Oct 8 21:17:44 2005 +0000 cosmetics: move some stuff from macroblock_encode to cache_save git-svn-id: svn://svn.videolan.org/x264/trunk@319 df754926-b1dd-0310-bc7b-ec298dee348c commit a0012bf38d366b1b97e571fe27c665139f3c631c [revision 318] Author: Loren Merritt Date: Sat Oct 8 06:49:29 2005 +0000 new option: --mixed-refs Allows each 8x8 or 16x8 partition to independently select a reference frame, as opposed to only one ref per macroblock. patch mostly by Alex Wright (alexw0885 at hotmail dot com). git-svn-id: svn://svn.videolan.org/x264/trunk@318 df754926-b1dd-0310-bc7b-ec298dee348c commit 68592115c77b8fcd091b32f2d39d8e129a95bbef [revision 317] Author: Loren Merritt Date: Sat Oct 8 04:45:51 2005 +0000 cosmetics in option parsing git-svn-id: svn://svn.videolan.org/x264/trunk@317 df754926-b1dd-0310-bc7b-ec298dee348c commit 4b925a1cfbdd6613449b70283cd6f80adbeb1f27 [revision 316] Author: Loren Merritt Date: Sat Oct 8 03:52:10 2005 +0000 expose the rest of the VUI flags. patch by Christian Heine. git-svn-id: svn://svn.videolan.org/x264/trunk@316 df754926-b1dd-0310-bc7b-ec298dee348c commit aebad793a69d175b139da28aafff6dbfec81d7c1 [revision 315] Author: Sam Hocevar Date: Tue Oct 4 12:08:33 2005 +0000 * common/amd64/mc-a.asm: use RIP-relative addressing in PIC mode. git-svn-id: svn://svn.videolan.org/x264/trunk@315 df754926-b1dd-0310-bc7b-ec298dee348c commit db80497dd2e8bb0cd02c45d73ca74294b0671b61 [revision 314] Author: Loren Merritt Date: Tue Oct 4 07:12:21 2005 +0000 temporal predictors for 16x16 motion search. git-svn-id: svn://svn.videolan.org/x264/trunk@314 df754926-b1dd-0310-bc7b-ec298dee348c commit 7e165477fb69bc107e3fcfdac3e2cb53541870f6 [revision 313] Author: Loren Merritt Date: Sun Oct 2 22:07:12 2005 +0000 slightly faster/cleaner block_residual_write_cabac git-svn-id: svn://svn.videolan.org/x264/trunk@313 df754926-b1dd-0310-bc7b-ec298dee348c commit 4c8ccfe6de4a44cd46bcaf1fc17ae90bfe34d958 [revision 312] Author: Loren Merritt Date: Sun Oct 2 20:12:46 2005 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@312 df754926-b1dd-0310-bc7b-ec298dee348c commit ce3a422466b4df055f5b67116483eee20676939c [revision 311] Author: Loren Merritt Date: Sun Oct 2 05:50:35 2005 +0000 cli: fix a crash on piped input. git-svn-id: svn://svn.videolan.org/x264/trunk@311 df754926-b1dd-0310-bc7b-ec298dee348c commit cb88eb7bf7756e25123cdfffdbdc49abc169ef33 [revision 310] Author: Loren Merritt Date: Sun Oct 2 05:01:39 2005 +0000 stats summary: separately report all 5 partition sizes, and add ref usages git-svn-id: svn://svn.videolan.org/x264/trunk@310 df754926-b1dd-0310-bc7b-ec298dee348c commit bab1d61dd306199747dd8f949bde2a49b20c6f70 [revision 309] Author: Loren Merritt Date: Sun Oct 2 04:03:06 2005 +0000 disposable frames shouldn't get their own coded_frame_num. git-svn-id: svn://svn.videolan.org/x264/trunk@309 df754926-b1dd-0310-bc7b-ec298dee348c commit 31a36aa8621c7bd0264b87421e8d0a490d7c45f5 [revision 308] Author: Loren Merritt Date: Sat Oct 1 19:33:26 2005 +0000 typo in ia32 x264_pixel_avg_weight_w8_mmxext git-svn-id: svn://svn.videolan.org/x264/trunk@308 df754926-b1dd-0310-bc7b-ec298dee348c commit 458e63cadb0c6295273fd85def3aca0098a309e3 [revision 307] Author: Loren Merritt Date: Sat Oct 1 06:48:13 2005 +0000 mmx avg (already existed by not used for bipred) mmx biweighted avg (3x faster than C) git-svn-id: svn://svn.videolan.org/x264/trunk@307 df754926-b1dd-0310-bc7b-ec298dee348c commit 3112619429c0cf781817867f0d124c882740d66f [revision 306] Author: Loren Merritt Date: Sat Oct 1 04:43:31 2005 +0000 cosmetics: move avg function ptrs from pixf to mc. git-svn-id: svn://svn.videolan.org/x264/trunk@306 df754926-b1dd-0310-bc7b-ec298dee348c commit 82d5e6faa6aa8ca8888481019513782ef9701240 [revision 305] Author: Loren Merritt Date: Tue Sep 27 19:59:09 2005 +0000 with B-pyramid, forget old refs in POC order instead of coded order. (before, b_skip was unavailable with pyramid and ref=1) git-svn-id: svn://svn.videolan.org/x264/trunk@305 df754926-b1dd-0310-bc7b-ec298dee348c commit 4223e3874f7268d9ea36f32a2150c3a123881f4b [revision 304] Author: Loren Merritt Date: Mon Sep 26 03:00:10 2005 +0000 typo in r296. patch by lurui. git-svn-id: svn://svn.videolan.org/x264/trunk@304 df754926-b1dd-0310-bc7b-ec298dee348c commit 2a3417b0fe4a806480e49a6dd13ab8d625b03466 [revision 303] Author: Sam Hocevar Date: Sun Sep 25 22:12:56 2005 +0000 * common/amd64/*.asm: use RIP-related addressing in PIC mode. git-svn-id: svn://svn.videolan.org/x264/trunk@303 df754926-b1dd-0310-bc7b-ec298dee348c commit 77997bffd6fdb3727c31f787767aafa11bc62266 [revision 302] Author: Sam Hocevar Date: Sun Sep 25 19:52:57 2005 +0000 * common/amd64/mc-a.asm: removed useless global variables git-svn-id: svn://svn.videolan.org/x264/trunk@302 df754926-b1dd-0310-bc7b-ec298dee348c commit b2e9af98bf5f44363f3877baf7bfa6cce4d64805 [revision 301] Author: Sam Hocevar Date: Sun Sep 25 13:52:58 2005 +0000 * configure: support extra $(ASFLAGS) through --extra-asflags. git-svn-id: svn://svn.videolan.org/x264/trunk@301 df754926-b1dd-0310-bc7b-ec298dee348c commit 1b16298a215393dc741fabb0e7212c0b0ee53846 [revision 300] Author: Loren Merritt Date: Sat Sep 24 19:41:50 2005 +0000 reorganized VfW UI. patch by Antony Boucher, graphic by Jarod. git-svn-id: svn://svn.videolan.org/x264/trunk@300 df754926-b1dd-0310-bc7b-ec298dee348c commit 35f641710900a39ea208860befc9cfe35043f7cd [revision 299] Author: Loren Merritt Date: Sat Sep 24 18:54:49 2005 +0000 MP4 output: update to GPAC 0.4 API. patch mostly by Robert Swain. git-svn-id: svn://svn.videolan.org/x264/trunk@299 df754926-b1dd-0310-bc7b-ec298dee348c commit cfebeac1a475f4a2ee57e5dd3cd1ff0c560f38db [revision 298] Author: Loren Merritt Date: Sat Sep 24 18:22:02 2005 +0000 faster mmx quant 15bit, and add 16bit version. total speedup: ~0.3% patch by Christian Heine. git-svn-id: svn://svn.videolan.org/x264/trunk@298 df754926-b1dd-0310-bc7b-ec298dee348c commit 49ac5e2f921ef940701e31ca7e6246e44480783b [revision 297] Author: Loren Merritt Date: Sat Sep 24 17:04:21 2005 +0000 faster mmx satd. *x16: 20%, *x8: 10%, total: 2-4%. ia32 patch by Christian Heine, amd64 port by me. git-svn-id: svn://svn.videolan.org/x264/trunk@297 df754926-b1dd-0310-bc7b-ec298dee348c commit 76192dcb1cc7720a1e633ba6b0fbdb2fbacbe9bb [revision 296] Author: Loren Merritt Date: Sat Sep 24 16:58:36 2005 +0000 allow i4x4 and i8x8 down-left prediction with emulated top-right samples. based on a patch by Johannes Reinhardt (Johannes dot Reinhardt at uni-konstanz dot de) git-svn-id: svn://svn.videolan.org/x264/trunk@296 df754926-b1dd-0310-bc7b-ec298dee348c commit 690a02b1c9132bfecc88068de757e6b0e5ef7b84 [revision 295] Author: Steve Lhomme Date: Tue Sep 20 16:18:23 2005 +0000 fps patch by Haali git-svn-id: svn://svn.videolan.org/x264/trunk@295 df754926-b1dd-0310-bc7b-ec298dee348c commit 8a5de70e926c334bcf422d21e8995b7be6ecf496 [revision 294] Author: Sam Hocevar Date: Tue Sep 20 15:50:41 2005 +0000 * configure: added support for ia64, mips/mipsel, m68k, arm, s390 and hppa platforms, as well as linux sparc. git-svn-id: svn://svn.videolan.org/x264/trunk@294 df754926-b1dd-0310-bc7b-ec298dee348c commit c4ffed4986fe4706c0c5c2b514ce95668f0b8393 [revision 293] Author: Loren Merritt Date: Wed Sep 14 17:20:17 2005 +0000 MMX quantization functions, and optimization of the C versions. about 3x faster quant_8x8, quant_4x4, quant_4x4_dc, and quant_2x2_dc. total speedup: 4-10%. patch by Alexander Izvorski and Christian Heine. git-svn-id: svn://svn.videolan.org/x264/trunk@293 df754926-b1dd-0310-bc7b-ec298dee348c commit 16f423a00ee91f692d441a31fa99394543995582 [revision 292] Author: Loren Merritt Date: Sat Sep 10 11:23:09 2005 +0000 SSE2 pixel comparison functions P4: SAD 16x*, SSD 16x*, SATD 16x*: 30% faster, SATD 8x8: 15% faster, total: 2-4% faster K8: SSD 16x*: 6% faster, total: not much patch by Alexander Izvorski. git-svn-id: svn://svn.videolan.org/x264/trunk@292 df754926-b1dd-0310-bc7b-ec298dee348c commit 6db6362c9d558d0acea9be1975344d217f453ab9 [revision 291] Author: Loren Merritt Date: Tue Aug 30 17:11:35 2005 +0000 10l in rev290: duplicate declaration of x264_pixel_sub_8x8_mmx. git-svn-id: svn://svn.videolan.org/x264/trunk@291 df754926-b1dd-0310-bc7b-ec298dee348c commit 2d05702f88d1058b2ecd3945cd01269eb86829bb [revision 290] Author: Loren Merritt Date: Mon Aug 29 20:37:31 2005 +0000 mmx 8x8 dct. On a K8: sub16x16_dct8 3806->1461, add16x16_idct8 4852->1297 cycles. total speedup: 1-3%. patch by Christian Heine (sennindemokrit at gmx dot net) git-svn-id: svn://svn.videolan.org/x264/trunk@290 df754926-b1dd-0310-bc7b-ec298dee348c commit 2e5b0b93384f8d48e32b26beb6badb8a3236c29b [revision 289] Author: Eric Petit Date: Mon Aug 29 13:20:45 2005 +0000 VC++ fix (thx fenrir) git-svn-id: svn://svn.videolan.org/x264/trunk@289 df754926-b1dd-0310-bc7b-ec298dee348c commit 2a6e7a685391f4ae465c79111583c91fb26cb5a8 [revision 288] Author: Eric Petit Date: Mon Aug 29 11:20:23 2005 +0000 x264.h: issue an explicit warning when neither stdint.h nor inttypes.h has be included before x264.h git-svn-id: svn://svn.videolan.org/x264/trunk@288 df754926-b1dd-0310-bc7b-ec298dee348c commit 0403fed87a9cea867afa55d45500f6396c326659 [revision 287] Author: Loren Merritt Date: Wed Aug 17 15:18:42 2005 +0000 VfW: SAR wording. patch by Sharktooth. git-svn-id: svn://svn.videolan.org/x264/trunk@287 df754926-b1dd-0310-bc7b-ec298dee348c commit 48af1d03ed42e14b51f5e9c6986bd910aaab5b7a [revision 286] Author: Loren Merritt Date: Tue Aug 16 15:09:41 2005 +0000 cli: workaround to allow "--ratetol inf" on win32. git-svn-id: svn://svn.videolan.org/x264/trunk@286 df754926-b1dd-0310-bc7b-ec298dee348c commit 796da8ed7e5ab52eb64d232c89e7e38bfa77215c [revision 285] Author: Loren Merritt Date: Tue Aug 9 18:48:57 2005 +0000 fix spatial direct mv prediction with B-pyramid. copied from libavcodec. git-svn-id: svn://svn.videolan.org/x264/trunk@285 df754926-b1dd-0310-bc7b-ec298dee348c commit 1459ac0dbca3f1f31557d9d8bb8911cb980aad6b [revision 284] Author: Laurent Aimar Date: Tue Aug 9 07:20:26 2005 +0000 * all: Patch by Mike Matsnev : "The following things were fixed: * AR calculation was broken on previous import * Wrong conditional in write_nalu_mkv() was fixed * Error checking was added in all places" git-svn-id: svn://svn.videolan.org/x264/trunk@284 df754926-b1dd-0310-bc7b-ec298dee348c commit 47673d940a290207345bb13f08c371aa435e92a2 [revision 283] Author: Laurent Aimar Date: Tue Aug 9 07:17:26 2005 +0000 xyuv: bug fixes + autodetect of video size. git-svn-id: svn://svn.videolan.org/x264/trunk@283 df754926-b1dd-0310-bc7b-ec298dee348c commit d9218cb35688033a78f936e963a4ca3572cfdb29 [revision 282] Author: Eric Petit Date: Sun Aug 7 17:17:05 2005 +0000 Run ranlib after make install (OS X needs that) git-svn-id: svn://svn.videolan.org/x264/trunk@282 df754926-b1dd-0310-bc7b-ec298dee348c commit 205910672b5686174b7d6f0a1960d53cd4bd9f9b [revision 281] Author: Loren Merritt Date: Tue Jul 26 16:07:17 2005 +0000 update i_mb_b16x8_cost_table[] for I8x8 mb type (r278 only fixed a symptom). git-svn-id: svn://svn.videolan.org/x264/trunk@281 df754926-b1dd-0310-bc7b-ec298dee348c commit d945b153baa4a81cb40a92e4c09b0e2f16081408 [revision 280] Author: Laurent Aimar Date: Fri Jul 22 15:51:10 2005 +0000 * all: Added matroska writing. Patch by Mike Matsnev. git-svn-id: svn://svn.videolan.org/x264/trunk@280 df754926-b1dd-0310-bc7b-ec298dee348c commit 5552052c55d550b48cb43d33cb3655ea53e4a273 [revision 279] Author: Laurent Aimar Date: Fri Jul 22 15:48:18 2005 +0000 * pixel.*: "I have completed additonal SAD implementations (8x16, 16x8 and 16x16) using Sparc VIS. Overall speedup is roughly 90% from straight C. I'm doing development and testing on a Sun Fire V220, with 2 * 1.5ghz UltraSPARC-III CPUs. I've hand-unrolled each of the loops. Sun's assembler does not appear to have macro functionality built-in and I didn't want to establish an external dependancy on m4. Please let me know if you run into any trouble with the patch." Patch by Phil Jensen. git-svn-id: svn://svn.videolan.org/x264/trunk@279 df754926-b1dd-0310-bc7b-ec298dee348c commit d2715116f9ef8d96d78e81010eda7fdee83cc212 [revision 278] Author: Laurent Aimar Date: Fri Jul 22 15:43:16 2005 +0000 analyse: "It correct the size of array i_mb_b16x8_cost_table from 16 to 17,otherwise,it can result a mismatch of b16x8 mb type cost and can result memory read overflow on it." Patch by lurui. git-svn-id: svn://svn.videolan.org/x264/trunk@278 df754926-b1dd-0310-bc7b-ec298dee348c commit f52a280836003583a11b93883e68ac23881355ac [revision 277] Author: Laurent Aimar Date: Wed Jul 20 15:39:44 2005 +0000 * x264 compilation on NetBSD. Patch by Mike Matsnev. git-svn-id: svn://svn.videolan.org/x264/trunk@277 df754926-b1dd-0310-bc7b-ec298dee348c commit 300e93ef08f5b389da3942474da8ec6fb9c62fda [revision 276] Author: Laurent Aimar Date: Wed Jul 20 15:27:18 2005 +0000 * all: "8x8 SAD written in Sparc Assembly using VIS." Patch by Phil Jensen. git-svn-id: svn://svn.videolan.org/x264/trunk@276 df754926-b1dd-0310-bc7b-ec298dee348c commit 1a0920f5c5c482d18dcbc775a542cb1529d019d0 [revision 275] Author: Loren Merritt Date: Fri Jul 15 16:21:58 2005 +0000 10l: rd score for sub-8x8 partitions used wrong mvs. git-svn-id: svn://svn.videolan.org/x264/trunk@275 df754926-b1dd-0310-bc7b-ec298dee348c commit 0f34713af75421dcf3db067511d26d08ebe36134 [revision 274] Author: Loren Merritt Date: Wed Jul 13 15:52:59 2005 +0000 faster SAD_INC_2x16P for amd64. patch by Josef Zlomek. git-svn-id: svn://svn.videolan.org/x264/trunk@274 df754926-b1dd-0310-bc7b-ec298dee348c commit 86a01ef552f00fcc3225776bd41d7ebfb6507d0b [revision 273] Author: Eric Petit Date: Sun Jul 10 12:51:21 2005 +0000 Fixed win32 handle leakage (thanks Trax) Default enabled support of threads on BeOS git-svn-id: svn://svn.videolan.org/x264/trunk@273 df754926-b1dd-0310-bc7b-ec298dee348c commit da60272bf0c4b65128d673daef4d4d7c09c13ae3 [revision 272] Author: Laurent Aimar Date: Thu Jul 7 07:48:36 2005 +0000 * Add support for UltraSparc (uname -m: sun4u) with Solaris. Patch by Tuukka Toivonen. git-svn-id: svn://svn.videolan.org/x264/trunk@272 df754926-b1dd-0310-bc7b-ec298dee348c commit 95c407157830f714c4914ceaeb850bebd198d14b [revision 271] Author: Laurent Aimar Date: Thu Jul 7 07:37:54 2005 +0000 * Faster SAD_INC_2x16P. Patch by Alexander Izvorski. git-svn-id: svn://svn.videolan.org/x264/trunk@271 df754926-b1dd-0310-bc7b-ec298dee348c commit 90793358d78f5ad79aef3cc09ea80d5ea81bb53b [revision 270] Author: Loren Merritt Date: Tue Jun 21 14:49:27 2005 +0000 example quant matrix file git-svn-id: svn://svn.videolan.org/x264/trunk@270 df754926-b1dd-0310-bc7b-ec298dee348c commit 7b1b45e8a6fc3e36447b7626617978dd7c9d5958 [revision 269] Author: Loren Merritt Date: Tue Jun 21 08:16:01 2005 +0000 --cqmfile reads quant matrices in a JM-compatible format. git-svn-id: svn://svn.videolan.org/x264/trunk@269 df754926-b1dd-0310-bc7b-ec298dee348c commit 7a77a1e7295b99a418c4fad2a5ab91f0dc896115 [revision 268] Author: Loren Merritt Date: Tue Jun 21 04:45:49 2005 +0000 adjust coded buffer size based on input resolution and QP (old default wasn't enough for HD lossless) git-svn-id: svn://svn.videolan.org/x264/trunk@268 df754926-b1dd-0310-bc7b-ec298dee348c commit ca8ead2eb1ac51d9784af6fe7a6a3df1fbf10ada [revision 267] Author: Loren Merritt Date: Mon Jun 20 00:36:05 2005 +0000 update avc2avi for high profile git-svn-id: svn://svn.videolan.org/x264/trunk@267 df754926-b1dd-0310-bc7b-ec298dee348c commit 1ab01bbc01bc482e9891fe843e1ddd14b7625540 [revision 266] Author: Loren Merritt Date: Mon Jun 20 00:08:28 2005 +0000 custom quant matrices git-svn-id: svn://svn.videolan.org/x264/trunk@266 df754926-b1dd-0310-bc7b-ec298dee348c commit 2c4b31daae223b688feb4a6fdef36fce3b1bc6f0 [revision 265] Author: Loren Merritt Date: Fri Jun 17 08:32:56 2005 +0000 VfW: workaround a windows unicode bug. patch by Leowai. git-svn-id: svn://svn.videolan.org/x264/trunk@265 df754926-b1dd-0310-bc7b-ec298dee348c commit 396133936510d57bc2054dd1c1d3d92fa0eb5495 [revision 264] Author: Loren Merritt Date: Fri Jun 17 08:21:48 2005 +0000 lossless mode enabled at qp=0 git-svn-id: svn://svn.videolan.org/x264/trunk@264 df754926-b1dd-0310-bc7b-ec298dee348c commit 2f9a70c0a5b257eb1413601df191556547f307d5 [revision 263] Author: Loren Merritt Date: Tue Jun 14 19:49:16 2005 +0000 VfW: enable RDO. some option dependencies. patch by Francesco Corriga. git-svn-id: svn://svn.videolan.org/x264/trunk@263 df754926-b1dd-0310-bc7b-ec298dee348c commit 15ecd54fc67e75ccd380a7e36720f1a0c2514f94 [revision 262] Author: Loren Merritt Date: Tue Jun 14 19:19:52 2005 +0000 rate-distortion optimized MB types in I- and P-frames (--subme 6) git-svn-id: svn://svn.videolan.org/x264/trunk@262 df754926-b1dd-0310-bc7b-ec298dee348c commit 41c37d9e05416a71c2499f788ea268032da0a6c4 [revision 261] Author: Loren Merritt Date: Sun Jun 12 23:17:12 2005 +0000 more VfW options. patch mostly by celtic_druid. git-svn-id: svn://svn.videolan.org/x264/trunk@261 df754926-b1dd-0310-bc7b-ec298dee348c commit a296ffcc5aa892d5281a9e6b2b4e863dd94e0b69 [revision 260] Author: Loren Merritt Date: Sat Jun 11 21:17:30 2005 +0000 VFW: 8x8 transform, SAR. patch by celtic_druid. git-svn-id: svn://svn.videolan.org/x264/trunk@260 df754926-b1dd-0310-bc7b-ec298dee348c commit 7832f017704235b31c7a33b54a06ab196c1dcc4a [revision 259] Author: Loren Merritt Date: Sat Jun 11 20:32:22 2005 +0000 threads option in vfw. patch by celtic_druid. git-svn-id: svn://svn.videolan.org/x264/trunk@259 df754926-b1dd-0310-bc7b-ec298dee348c commit 8c6e66479e66da8a9a79eacfec9fc2ff39a24464 [revision 258] Author: Loren Merritt Date: Sat Jun 11 19:27:02 2005 +0000 win32 threads enabled by default git-svn-id: svn://svn.videolan.org/x264/trunk@258 df754926-b1dd-0310-bc7b-ec298dee348c commit 96813e36dc54e1e9866dad24a8c0cc7a748f0d4a [revision 257] Author: Loren Merritt Date: Sat Jun 11 19:15:35 2005 +0000 vfw installer nsis script. patch by Francesco Corriga. git-svn-id: svn://svn.videolan.org/x264/trunk@257 df754926-b1dd-0310-bc7b-ec298dee348c commit 8025723ee4a1c99e3e833ce963d05e5eb8c74606 [revision 256] Author: Loren Merritt Date: Sat Jun 11 05:52:38 2005 +0000 print 8x8 transform usage % in stats summary. git-svn-id: svn://svn.videolan.org/x264/trunk@256 df754926-b1dd-0310-bc7b-ec298dee348c commit 26aa962acdc90204f7c915be91ead00ebcc5f30d [revision 255] Author: Loren Merritt Date: Wed Jun 8 17:16:20 2005 +0000 revert 216, another try at max_dec_frame_buffering. disable adaptive cabac_idc by default; 0 is always best anyway. git-svn-id: svn://svn.videolan.org/x264/trunk@255 df754926-b1dd-0310-bc7b-ec298dee348c commit c4f5de5230b584189c57db18f68d73f19d653d00 [revision 254] Author: Loren Merritt Date: Wed Jun 8 00:38:03 2005 +0000 typo in cabac tables git-svn-id: svn://svn.videolan.org/x264/trunk@254 df754926-b1dd-0310-bc7b-ec298dee348c commit 2b5a6b2bd914a9d3ff9c304062c93f28c58ff532 [revision 253] Author: Loren Merritt Date: Sun Jun 5 20:39:58 2005 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@253 df754926-b1dd-0310-bc7b-ec298dee348c commit 916136c96d49961ff944b6ef2feeedfc7a90af98 [revision 252] Author: Loren Merritt Date: Sun Jun 5 18:39:21 2005 +0000 fix i8x8 decision with chroma_me git-svn-id: svn://svn.videolan.org/x264/trunk@252 df754926-b1dd-0310-bc7b-ec298dee348c commit 8614594835ce25879c0d01ca88625ea444d577f2 [revision 251] Author: Loren Merritt Date: Sun Jun 5 11:07:28 2005 +0000 SATD-based decision for 8x8 transform in inter-MBs. Enable 8x8 intra. CLI options: --8x8dct, --analyse i8x8. git-svn-id: svn://svn.videolan.org/x264/trunk@251 df754926-b1dd-0310-bc7b-ec298dee348c commit 6bf1398b824c013184548277eb8f2dbccd4d6fc5 [revision 250] Author: Eric Petit Date: Sun Jun 5 10:17:10 2005 +0000 Use win32 native threads (you still have to --enable-pthread to use them, though) git-svn-id: svn://svn.videolan.org/x264/trunk@250 df754926-b1dd-0310-bc7b-ec298dee348c commit 46a487299946e8a2130c3629bfaac1252ff068c4 [revision 249] Author: Loren Merritt Date: Sun Jun 5 01:09:38 2005 +0000 slightly faster 8x8 dct git-svn-id: svn://svn.videolan.org/x264/trunk@249 df754926-b1dd-0310-bc7b-ec298dee348c commit 398a6bf064d7ce46b0cb0edc66323473009d5e06 [revision 248] Author: Loren Merritt Date: Sat Jun 4 06:23:56 2005 +0000 remove unused tables from SPS/PPS. reduces overhead when syncing threads. git-svn-id: svn://svn.videolan.org/x264/trunk@248 df754926-b1dd-0310-bc7b-ec298dee348c commit 1729616639eada4977171af3611f3040113f1f01 [revision 247] Author: Loren Merritt Date: Fri Jun 3 09:58:25 2005 +0000 10l (debug stuff in 246) git-svn-id: svn://svn.videolan.org/x264/trunk@247 df754926-b1dd-0310-bc7b-ec298dee348c commit 1ab45c8f7411f7b4453ddff66919910e823ed33b [revision 246] Author: Loren Merritt Date: Fri Jun 3 05:33:15 2005 +0000 8x8 transform and 8x8 intra prediction. (backend only, not yet used by mb analysis) git-svn-id: svn://svn.videolan.org/x264/trunk@246 df754926-b1dd-0310-bc7b-ec298dee348c commit e46db68534f54a52c9df7595d8bd8fd4c8b21b53 [revision 245] Author: Loren Merritt Date: Wed Jun 1 06:49:00 2005 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@245 df754926-b1dd-0310-bc7b-ec298dee348c commit 7f988086c20dc28cafdec793af7900fcb477a25a [revision 244] Author: Loren Merritt Date: Wed Jun 1 05:31:39 2005 +0000 fix a bug with cabac + B-frames + mref + slices. call visualization per frame instead of per slice. git-svn-id: svn://svn.videolan.org/x264/trunk@244 df754926-b1dd-0310-bc7b-ec298dee348c commit b1f4d5b12789e6d608288b71ebefa59acf4fba86 [revision 243] Author: Måns Rullgård Date: Mon May 30 19:47:02 2005 +0000 accept the standard --prefix etc. options git-svn-id: svn://svn.videolan.org/x264/trunk@243 df754926-b1dd-0310-bc7b-ec298dee348c commit c77e709785fab74313a6c443c4f2f00fb9a86b70 [revision 242] Author: Loren Merritt Date: Mon May 30 01:52:00 2005 +0000 tweak cflags git-svn-id: svn://svn.videolan.org/x264/trunk@242 df754926-b1dd-0310-bc7b-ec298dee348c commit e85db920bb31a699b38c057f51a3eb68bb1b719d [revision 241] Author: Eric Petit Date: Sun May 29 20:27:09 2005 +0000 Fixed multithreading on BeOS (pthread emulation required) git-svn-id: svn://svn.videolan.org/x264/trunk@241 df754926-b1dd-0310-bc7b-ec298dee348c commit 10851d0e11e90e814c37695aa244f113b21415f2 [revision 240] Author: Loren Merritt Date: Sun May 29 18:28:49 2005 +0000 multithreading (via slices) git-svn-id: svn://svn.videolan.org/x264/trunk@240 df754926-b1dd-0310-bc7b-ec298dee348c commit 36f6321d4dd1b87331bec691ba1bdd3c6ec19b22 [revision 239] Author: Loren Merritt Date: Tue May 24 05:10:38 2005 +0000 move zones parsing to ratecontrol.c; allows passing in zones as a string. git-svn-id: svn://svn.videolan.org/x264/trunk@239 df754926-b1dd-0310-bc7b-ec298dee348c commit 470e1b284f31e294119c7bc457a762488b34dd60 [revision 238] Author: Loren Merritt Date: Tue May 24 04:16:54 2005 +0000 UMHex motion seach (but no early termination yet) git-svn-id: svn://svn.videolan.org/x264/trunk@238 df754926-b1dd-0310-bc7b-ec298dee348c commit c8b1a477d2d145698b065d7c20cd10be2f75e94d [revision 237] Author: Loren Merritt Date: Tue May 24 01:34:57 2005 +0000 Zoned ratecontrol. git-svn-id: svn://svn.videolan.org/x264/trunk@237 df754926-b1dd-0310-bc7b-ec298dee348c commit 0072b802fb9205be3606f45ec9cc6f5111c3ec3e [revision 236] Author: Loren Merritt Date: Mon May 23 08:57:02 2005 +0000 fix rounding of intra dequant when qp<=3 git-svn-id: svn://svn.videolan.org/x264/trunk@236 df754926-b1dd-0310-bc7b-ec298dee348c commit 7c02f091422b68fa01d48645eb2f04bbf409fb79 [revision 235] Author: Loren Merritt Date: Sat May 21 20:49:06 2005 +0000 API: x264_encoder_reconfig(). (not yet used by any frontend) git-svn-id: svn://svn.videolan.org/x264/trunk@235 df754926-b1dd-0310-bc7b-ec298dee348c commit 7fef6efa884a0fdad75181564a916ac94f81e3b3 [revision 234] Author: Eric Petit Date: Thu May 19 15:42:48 2005 +0000 Makefile: in target "install", first create the directories if they don't already exist git-svn-id: svn://svn.videolan.org/x264/trunk@234 df754926-b1dd-0310-bc7b-ec298dee348c commit 809c516abe16bf051beb9d053d673a26906aa43c [revision 233] Author: Eric Petit Date: Sun May 15 20:19:22 2005 +0000 Optimized subXxX_dct git-svn-id: svn://svn.videolan.org/x264/trunk@233 df754926-b1dd-0310-bc7b-ec298dee348c commit f025abc9c0006c0a67d112afc6daff78c4fa7aad [revision 232] Author: Eric Petit Date: Sat May 14 15:49:36 2005 +0000 s/==/=/ git-svn-id: svn://svn.videolan.org/x264/trunk@232 df754926-b1dd-0310-bc7b-ec298dee348c commit 04ded39b9ba4e8f0b983efcc056292f25d544b9f [revision 231] Author: Eric Petit Date: Sat May 14 07:08:08 2005 +0000 ppc/: compile fixes for Linux/PPC (courtesy of Rasmus Rohde) and for gcc < 4 git-svn-id: svn://svn.videolan.org/x264/trunk@231 df754926-b1dd-0310-bc7b-ec298dee348c commit 94829ef6e277315e635df05d669848b5216f00d3 [revision 230] Author: Loren Merritt Date: Fri May 13 16:54:03 2005 +0000 visualize reference pic numbers. misc cleanups in visualization. patch by Tuukka Toivonen. git-svn-id: svn://svn.videolan.org/x264/trunk@230 df754926-b1dd-0310-bc7b-ec298dee348c commit 4d5c7a033fbe7e7b168381a9fa15e8c2eb1a6a2f [revision 229] Author: Eric Petit Date: Fri May 13 15:30:18 2005 +0000 ppc/*: more tuning on satd (+5%) git-svn-id: svn://svn.videolan.org/x264/trunk@229 df754926-b1dd-0310-bc7b-ec298dee348c commit e0bd8066395df74d5f2edc851c048512a0fed4ba [revision 228] Author: Loren Merritt Date: Fri May 13 08:03:42 2005 +0000 CLI option: --seek git-svn-id: svn://svn.videolan.org/x264/trunk@228 df754926-b1dd-0310-bc7b-ec298dee348c commit 036494a60f7850c1613c5084fe9a11c7821cb5a7 [revision 227] Author: Loren Merritt Date: Thu May 12 23:03:49 2005 +0000 CLI option: --visualize Displays the encoded video along with MB types and motion vectors. patch by Tuukka Toivonen. git-svn-id: svn://svn.videolan.org/x264/trunk@227 df754926-b1dd-0310-bc7b-ec298dee348c commit 31c91bd71f1cc7fd0988892657a3574dc534f628 [revision 226] Author: Loren Merritt Date: Thu May 12 19:48:10 2005 +0000 fix an uninitialized value in slicetype_analyse git-svn-id: svn://svn.videolan.org/x264/trunk@226 df754926-b1dd-0310-bc7b-ec298dee348c commit 92ea0c5c30a74408e931227765009ef8aaee1542 [revision 225] Author: Loren Merritt Date: Wed May 11 17:58:00 2005 +0000 port recent MC asm changes to amd64. patch by Josef Zlomek. git-svn-id: svn://svn.videolan.org/x264/trunk@225 df754926-b1dd-0310-bc7b-ec298dee348c commit d926e41d04312639d762d79af3867d61ce340591 [revision 224] Author: Eric Petit Date: Wed May 11 16:22:18 2005 +0000 ppc/*: + Removed unused code + Optimized mc chroma 4xH and satd 8x4 and 4x8 + Won a bunch of cycles by not trusting gcc about inlining and unrolling properly (about 17% faster globally) git-svn-id: svn://svn.videolan.org/x264/trunk@224 df754926-b1dd-0310-bc7b-ec298dee348c commit aecc6ab057616f32eb0643b36db2d5b04d7a07ea [revision 223] Author: Loren Merritt Date: Wed May 11 15:57:43 2005 +0000 New ratecontrol options: 1pass ABR. VBV constraint for ABR and 2pass. There is no longer a dedicated CBR mode: use ABR+VBV. VfW now uses ABR instead of CQP for 1st of multipass. git-svn-id: svn://svn.videolan.org/x264/trunk@223 df754926-b1dd-0310-bc7b-ec298dee348c commit 540fba7a1404909074eb08e76b98d7f9d36fd5e9 [revision 222] Author: Loren Merritt Date: Wed May 11 00:15:34 2005 +0000 use a predicted mv as starting point for subpel refinement. git-svn-id: svn://svn.videolan.org/x264/trunk@222 df754926-b1dd-0310-bc7b-ec298dee348c commit dcb0aebebeb197c75fc5f0f49185f6afb6fd90ec [revision 221] Author: Loren Merritt Date: Tue May 10 08:21:36 2005 +0000 slight speedup in halfpel interpolation. patch by Mathieu Monnier. git-svn-id: svn://svn.videolan.org/x264/trunk@221 df754926-b1dd-0310-bc7b-ec298dee348c commit 22a567bbe57fec9cf4beacca7517cc6d9139e091 [revision 220] Author: Loren Merritt Date: Fri May 6 19:38:40 2005 +0000 Cleaner allocation of tmp space in halfpel interpolation; fixes some valgrind/nasm warnings. patch by Mathieu Monnier. git-svn-id: svn://svn.videolan.org/x264/trunk@220 df754926-b1dd-0310-bc7b-ec298dee348c commit ca4a34dfe0e6d93ce7598dd18c3c6af8c611d7e5 [revision 219] Author: Loren Merritt Date: Tue May 3 08:25:31 2005 +0000 "2pass failed to converge" is no longer considered fatal. git-svn-id: svn://svn.videolan.org/x264/trunk@219 df754926-b1dd-0310-bc7b-ec298dee348c commit ab2cdf4b804f9e97a112fa4be96c1306522746e4 [revision 218] Author: Loren Merritt Date: Sat Apr 30 01:20:50 2005 +0000 Updated MSVC project files. thanks to Bonzi. git-svn-id: svn://svn.videolan.org/x264/trunk@218 df754926-b1dd-0310-bc7b-ec298dee348c commit e0a640413f484d1db034a9ecbd0fa472204f273a [revision 217] Author: Loren Merritt Date: Mon Apr 25 18:39:32 2005 +0000 cosmetics. silence some gcc warnings. amd64 doesn't need a separate copy of the c/h files, only the asm. git-svn-id: svn://svn.videolan.org/x264/trunk@217 df754926-b1dd-0310-bc7b-ec298dee348c commit d2ad6a20941a4f25b69c88d136e7450d10b035be [revision 216] Author: Loren Merritt Date: Fri Apr 22 04:05:35 2005 +0000 10l (214 wrote wrong DPB size in SPS -> B-pyramid broke) git-svn-id: svn://svn.videolan.org/x264/trunk@216 df754926-b1dd-0310-bc7b-ec298dee348c commit 7229a11c2fe117e0511cd76fa264baf25be92a5f [revision 215] Author: Loren Merritt Date: Thu Apr 21 09:20:43 2005 +0000 CLI (mp4): return to 'capture' output mode, remove useless SetCtsPackMode() (fixed in gpac). Note: requires gpac cvs-20050419 or later. patch by bobo. git-svn-id: svn://svn.videolan.org/x264/trunk@215 df754926-b1dd-0310-bc7b-ec298dee348c commit 9b44391701779bfb0d291592d1d81c70bcf6c116 [revision 214] Author: Loren Merritt Date: Tue Apr 19 23:09:29 2005 +0000 combined L0 & L1 reference lists are limited to a total of 16 pics. git-svn-id: svn://svn.videolan.org/x264/trunk@214 df754926-b1dd-0310-bc7b-ec298dee348c commit 41f9b8134c332599555bb44c3d0b8e94af44ebf9 [revision 213] Author: Loren Merritt Date: Tue Apr 19 18:44:42 2005 +0000 amd64 asm patch, part2. by Josef Zlomek ( josef dot zlomek at xeris dot cz ) git-svn-id: svn://svn.videolan.org/x264/trunk@213 df754926-b1dd-0310-bc7b-ec298dee348c commit 413d8fa90917044e0ffaffb7009ccbc8059c61b0 [revision 212] Author: Loren Merritt Date: Tue Apr 19 18:35:45 2005 +0000 amd64 asm patch, part1. git-svn-id: svn://svn.videolan.org/x264/trunk@212 df754926-b1dd-0310-bc7b-ec298dee348c commit 7d35ba6bf080610d8f144f4270e961c69ba14f1c [revision 211] Author: Loren Merritt Date: Tue Apr 19 08:45:36 2005 +0000 Allow manual selection of fullpel ME method. New method: Exhaustive search. based on a patch by Tuukka Toivonen. git-svn-id: svn://svn.videolan.org/x264/trunk@211 df754926-b1dd-0310-bc7b-ec298dee348c commit 0c641421898f5c3087d52abcfd35ab617d101010 [revision 210] Author: Loren Merritt Date: Tue Apr 19 01:42:12 2005 +0000 misc makefile changes. propogate --extra-cflags to vfw. 'make clean' removes x264.exe and vfw. tweak dependencies. git-svn-id: svn://svn.videolan.org/x264/trunk@210 df754926-b1dd-0310-bc7b-ec298dee348c commit 1519835f0fa218993ed031a2247ec88eb5906dd7 [revision 209] Author: Loren Merritt Date: Mon Apr 18 02:00:58 2005 +0000 10l (CLI: fflush after progress update) git-svn-id: svn://svn.videolan.org/x264/trunk@209 df754926-b1dd-0310-bc7b-ec298dee348c commit da4c0384503bd7b2fa7752ef2045e5060e5df0cd [revision 208] Author: Loren Merritt Date: Sun Apr 17 18:43:17 2005 +0000 CLI: progress indicator git-svn-id: svn://svn.videolan.org/x264/trunk@208 df754926-b1dd-0310-bc7b-ec298dee348c commit a61378bea90edd13a0e9b907917f7645e9266750 [revision 207] Author: Loren Merritt Date: Sat Apr 16 20:21:06 2005 +0000 VfW: build from main makefile git-svn-id: svn://svn.videolan.org/x264/trunk@207 df754926-b1dd-0310-bc7b-ec298dee348c commit c6f3d17ffa67ad27f126bf579a08a443023ad0d3 [revision 206] Author: Eric Petit Date: Fri Apr 15 17:26:09 2005 +0000 [mp4] ftyp & moov boxes at the begining of the file, (thanks to jeanlf for comments) patch by bobololo git-svn-id: svn://svn.videolan.org/x264/trunk@206 df754926-b1dd-0310-bc7b-ec298dee348c commit 74eecd32358de0799a1b9bad041ebb6550002769 [revision 205] Author: Loren Merritt Date: Thu Apr 14 23:04:48 2005 +0000 CLI: --fps had side-effects. fixed. git-svn-id: svn://svn.videolan.org/x264/trunk@205 df754926-b1dd-0310-bc7b-ec298dee348c commit 78ca42c56ec53e153fef1b2a1a612191c840d797 [revision 204] Author: Loren Merritt Date: Thu Apr 14 21:59:00 2005 +0000 CLI: cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@204 df754926-b1dd-0310-bc7b-ec298dee348c commit e06dfd4ac1cd0c80525f2dfbacbce28c543770fc [revision 203] Author: Loren Merritt Date: Thu Apr 14 19:45:08 2005 +0000 Makefile: strip x264cli. tweak stats summary. git-svn-id: svn://svn.videolan.org/x264/trunk@203 df754926-b1dd-0310-bc7b-ec298dee348c commit 29facf8bf218a7c7c47ca48c8b7abb6672d6544e [revision 202] Author: Laurent Aimar Date: Wed Apr 13 14:25:32 2005 +0000 * x264.c: Fix ctts box creation. Patch by bobololo from Ateme. git-svn-id: svn://svn.videolan.org/x264/trunk@202 df754926-b1dd-0310-bc7b-ec298dee348c commit 1014aa4e4fa0097e98754afbcf68245a14480710 [revision 201] Author: Eric Petit Date: Wed Apr 13 03:43:07 2005 +0000 common/ppc: more cleaning, optimized a bit git-svn-id: svn://svn.videolan.org/x264/trunk@201 df754926-b1dd-0310-bc7b-ec298dee348c commit 77404162c8588abc9b720b88e20fac34dfe31139 [revision 200] Author: Loren Merritt Date: Tue Apr 12 20:38:40 2005 +0000 CLI: require output file (don't default to stdout). warn if trying to use mp4 or avis when not supported. misc cleanup. git-svn-id: svn://svn.videolan.org/x264/trunk@200 df754926-b1dd-0310-bc7b-ec298dee348c commit fe905276b25c8aa202379d0b5c0115d7b5b631c8 [revision 199] Author: Eric Petit Date: Tue Apr 12 18:45:24 2005 +0000 configure: use -falign-loops=16 on OS X common/ppc/: added AltiVecized mc_chroma + cleaning checkasm.c: really fixed MC tests git-svn-id: svn://svn.videolan.org/x264/trunk@199 df754926-b1dd-0310-bc7b-ec298dee348c commit a1b9531707b835e6934cadfb78249149f6351d7e [revision 198] Author: Loren Merritt Date: Tue Apr 12 17:33:10 2005 +0000 Configure tweaks. Allow avis-input in mingw. Turn off debug by default. git-svn-id: svn://svn.videolan.org/x264/trunk@198 df754926-b1dd-0310-bc7b-ec298dee348c commit 35d85ca65d77f4013cfc37b2dd76b9ef87db144d [revision 197] Author: Eric Petit Date: Tue Apr 12 16:34:48 2005 +0000 checkasm.c: fixed MC tests git-svn-id: svn://svn.videolan.org/x264/trunk@197 df754926-b1dd-0310-bc7b-ec298dee348c commit c0abfd39627fcb3e2f6c9aed7ebbed7dfda9230e [revision 196] Author: Loren Merritt Date: Tue Apr 12 03:34:25 2005 +0000 CLI: MP4 muxing. patch by bobo from Ateme. git-svn-id: svn://svn.videolan.org/x264/trunk@196 df754926-b1dd-0310-bc7b-ec298dee348c commit e1b747ff05b28ee786425d48be53376c620a1cdc [revision 195] Author: Eric Petit Date: Mon Apr 11 21:21:05 2005 +0000 Cygwin fixes git-svn-id: svn://svn.videolan.org/x264/trunk@195 df754926-b1dd-0310-bc7b-ec298dee348c commit b7c3b444753d5ddce3b87249c96a207c85301075 [revision 194] Author: Eric Petit Date: Mon Apr 11 20:52:31 2005 +0000 configure: ooops, restored -g ratecontrol.c: OS X has exp2f in -lmx checkasm: quick compile fix git-svn-id: svn://svn.videolan.org/x264/trunk@194 df754926-b1dd-0310-bc7b-ec298dee348c commit ecbf942b1e46e1a4df0e8fd87db538342d968059 [revision 193] Author: Måns Rullgård Date: Mon Apr 11 20:00:49 2005 +0000 add x86_64 to configure git-svn-id: svn://svn.videolan.org/x264/trunk@193 df754926-b1dd-0310-bc7b-ec298dee348c commit 7d9ac7c215bc0b77c538d100c92498e847e1cfa8 [revision 192] Author: Eric Petit Date: Mon Apr 11 19:41:28 2005 +0000 set svn:ignore git-svn-id: svn://svn.videolan.org/x264/trunk@192 df754926-b1dd-0310-bc7b-ec298dee348c commit 6490f4398d9e28e65d7517849e729e14eede8c5b [revision 191] Author: Eric Petit Date: Mon Apr 11 19:28:03 2005 +0000 Added a configure to detect the platform/system/etc so people don't have to edit the Makefile (will work for Linux/OS X/BeOS/FreeBSD, feel free to modify for others), and we can now remove the Jamfile which was broken most of the time anyway. git-svn-id: svn://svn.videolan.org/x264/trunk@191 df754926-b1dd-0310-bc7b-ec298dee348c commit b12cb05a8fee91c50dc3d1d3c2569a801cc1a5e3 [revision 190] Author: Loren Merritt Date: Sun Apr 10 23:35:01 2005 +0000 Makefiles: better dependencies for SEI version number git-svn-id: svn://svn.videolan.org/x264/trunk@190 df754926-b1dd-0310-bc7b-ec298dee348c commit 90a6fd3e4e8685f990c7f9fe05c8718e77c0e080 [revision 189] Author: Måns Rullgård Date: Thu Apr 7 23:26:51 2005 +0000 Forgot rbsp_trailing_bits in AUD NAL git-svn-id: svn://svn.videolan.org/x264/trunk@189 df754926-b1dd-0310-bc7b-ec298dee348c commit e103917aa0cbb702ba09c2507565398d7f129c2e [revision 188] Author: Måns Rullgård Date: Thu Apr 7 23:11:06 2005 +0000 Optionally use access unit delimiter NAL units. git-svn-id: svn://svn.videolan.org/x264/trunk@188 df754926-b1dd-0310-bc7b-ec298dee348c commit d4663a41a4bb0c67eb861046ed2917111257883f [revision 187] Author: Loren Merritt Date: Tue Apr 5 21:32:52 2005 +0000 VfW: cleaner install on win98. patch by Riccardo Stievano. git-svn-id: svn://svn.videolan.org/x264/trunk@187 df754926-b1dd-0310-bc7b-ec298dee348c commit 990e58b646629a2937e76794b97892d7806a932e [revision 186] Author: Loren Merritt Date: Tue Apr 5 20:50:13 2005 +0000 new util: countquant for 2pass statsfiles git-svn-id: svn://svn.videolan.org/x264/trunk@186 df754926-b1dd-0310-bc7b-ec298dee348c commit b780e711dd0a1e97535c690f84e9726eefa95c2c [revision 185] Author: Loren Merritt Date: Tue Apr 5 20:39:47 2005 +0000 print svn version number in SEI info and in CLI/VfW. git-svn-id: svn://svn.videolan.org/x264/trunk@185 df754926-b1dd-0310-bc7b-ec298dee348c commit ea9308c6b3bfc891a2dcebe1dc89e0c301c57066 [revision 184] Author: Måns Rullgård Date: Thu Mar 31 21:20:41 2005 +0000 Make reconstructed frame available to caller. git-svn-id: svn://svn.videolan.org/x264/trunk@184 df754926-b1dd-0310-bc7b-ec298dee348c commit 6dcb0e4b6d827b9c79f402ff91049b2830b8a743 [revision 183] Author: Loren Merritt Date: Thu Mar 31 06:03:22 2005 +0000 make install git-svn-id: svn://svn.videolan.org/x264/trunk@183 df754926-b1dd-0310-bc7b-ec298dee348c commit 11de51977d28b9ff242aa137f9c270b0f1b3f465 [revision 182] Author: Loren Merritt Date: Thu Mar 31 05:59:11 2005 +0000 free() -> x264_free() git-svn-id: svn://svn.videolan.org/x264/trunk@182 df754926-b1dd-0310-bc7b-ec298dee348c commit de97a12a8b976acad6afdbeda54e4bfbdd9bf8b5 [revision 181] Author: Loren Merritt Date: Mon Mar 28 05:08:43 2005 +0000 CLI: flush B-frames at the end of the encode git-svn-id: svn://svn.videolan.org/x264/trunk@181 df754926-b1dd-0310-bc7b-ec298dee348c commit 0df24cf995faf3169fe15d808e4fff00c18ad7dc [revision 180] Author: Loren Merritt Date: Sun Mar 27 20:49:59 2005 +0000 convert mc's inline asm to nasm (slight speedup and msvc compatibility). patch by Mathieu Monnier. git-svn-id: svn://svn.videolan.org/x264/trunk@180 df754926-b1dd-0310-bc7b-ec298dee348c commit 48c34d0bffd57ba7c73f20bd6c892b4b06131140 [revision 179] Author: Loren Merritt Date: Sun Mar 27 06:58:35 2005 +0000 buffer overruns in slicetype_decision. patch by Mathieu Monnier. git-svn-id: svn://svn.videolan.org/x264/trunk@179 df754926-b1dd-0310-bc7b-ec298dee348c commit a1c2c04693de8fe2d7712249c06c7a6406d0b422 [revision 178] Author: Loren Merritt Date: Thu Mar 17 17:35:11 2005 +0000 tweak usage message git-svn-id: svn://svn.videolan.org/x264/trunk@178 df754926-b1dd-0310-bc7b-ec298dee348c commit ac93ce1bb01701ddc0faa79eeb1079288b6e3543 [revision 177] Author: Loren Merritt Date: Wed Mar 16 22:02:02 2005 +0000 Simplify inter analysis option names. (psub16x16 -> p8x8) patch by Robert Swain. git-svn-id: svn://svn.videolan.org/x264/trunk@177 df754926-b1dd-0310-bc7b-ec298dee348c commit 04557605de60718c172ce6d1fc26b30d6fd2ee8b [revision 176] Author: Loren Merritt Date: Wed Mar 16 21:52:59 2005 +0000 173 broke .depend when debugging was enabled git-svn-id: svn://svn.videolan.org/x264/trunk@176 df754926-b1dd-0310-bc7b-ec298dee348c commit bf7f679c793a2db2580e00f87eb3bed45b47a805 [revision 175] Author: Loren Merritt Date: Wed Mar 16 20:50:19 2005 +0000 early termination for intra4x4 analysis git-svn-id: svn://svn.videolan.org/x264/trunk@175 df754926-b1dd-0310-bc7b-ec298dee348c commit ee5b2be9406eb8b9b11180f406febc944fd8845d [revision 174] Author: Måns Rullgård Date: Tue Mar 15 12:09:00 2005 +0000 Check/fix range of x264_param_t.rc.i_qp_constant. git-svn-id: svn://svn.videolan.org/x264/trunk@174 df754926-b1dd-0310-bc7b-ec298dee348c commit 94086b8bb5885f76093e74b5a5b0f4d4db287c95 [revision 173] Author: Eric Petit Date: Tue Mar 15 07:21:18 2005 +0000 Cleaned up and fixed Makefile for OS X and BeOS (hopefully FreeBSD too) It defaults for x86/linux, others: uncomment the lines for your platform & OS at the beginning of the Makefile git-svn-id: svn://svn.videolan.org/x264/trunk@173 df754926-b1dd-0310-bc7b-ec298dee348c commit cb6a40f00d1f5f14b9c14974309b43955a0b83ed [revision 172] Author: Loren Merritt Date: Tue Mar 15 02:30:16 2005 +0000 macroblock_analyse: simplify cost comparisons. (cosmetic) CLI: enable cabac by default. git-svn-id: svn://svn.videolan.org/x264/trunk@172 df754926-b1dd-0310-bc7b-ec298dee348c commit 79fa69451ad4552c2dd84fcd3c5e75da136af17f [revision 171] Author: Loren Merritt Date: Mon Mar 14 22:47:19 2005 +0000 Chroma ME (P-frames only). git-svn-id: svn://svn.videolan.org/x264/trunk@171 df754926-b1dd-0310-bc7b-ec298dee348c commit abbd6c56da04a9e10d10a4bd158104826e8fc81a [revision 170] Author: Loren Merritt Date: Mon Mar 14 13:05:57 2005 +0000 SSE optimized chroma MC. patch by Radek Czyz. git-svn-id: svn://svn.videolan.org/x264/trunk@170 df754926-b1dd-0310-bc7b-ec298dee348c commit 553b8295bac6b6fd9d91e591bca1299923f0fc96 [revision 169] Author: Loren Merritt Date: Sun Mar 13 23:36:42 2005 +0000 167 broke psnr calculation for non-mod-32 inputs git-svn-id: svn://svn.videolan.org/x264/trunk@169 df754926-b1dd-0310-bc7b-ec298dee348c commit 70da43b22cd394160c4358a33330446bc104c78e [revision 168] Author: Eric Petit Date: Sun Mar 13 18:49:51 2005 +0000 sqrtf requires -lmx on Mac OS X git-svn-id: svn://svn.videolan.org/x264/trunk@168 df754926-b1dd-0310-bc7b-ec298dee348c commit e72f431c685731663d2824aa768218927490e704 [revision 167] Author: Loren Merritt Date: Sun Mar 13 10:25:11 2005 +0000 use mmx ssd for psnr calculation. git-svn-id: svn://svn.videolan.org/x264/trunk@167 df754926-b1dd-0310-bc7b-ec298dee348c commit be2f0e088810860ab760d8d362a9450aaf917a29 [revision 166] Author: Loren Merritt Date: Sun Mar 13 08:26:52 2005 +0000 revert 164. blame Spyder. git-svn-id: svn://svn.videolan.org/x264/trunk@166 df754926-b1dd-0310-bc7b-ec298dee348c commit 73522c84014f240abe7ee70c6e98657b08f97b44 [revision 165] Author: Loren Merritt Date: Sun Mar 13 07:04:16 2005 +0000 SSD comparison function (not yet used). Cosmetics in mmx SAD. git-svn-id: svn://svn.videolan.org/x264/trunk@165 df754926-b1dd-0310-bc7b-ec298dee348c commit c68f34e555e22f4687d985ade6d81ea87cc73f29 [revision 164] Author: Loren Merritt Date: Sat Mar 12 00:23:50 2005 +0000 VfW: reject YUY2 and RGB input formats git-svn-id: svn://svn.videolan.org/x264/trunk@164 df754926-b1dd-0310-bc7b-ec298dee348c commit fd527e3760074a19637c503d0f828d97b7c079fd [revision 163] Author: Måns Rullgård Date: Fri Mar 11 18:10:35 2005 +0000 Really fix QP override. git-svn-id: svn://svn.videolan.org/x264/trunk@163 df754926-b1dd-0310-bc7b-ec298dee348c commit a2245645c8b3948de32f2c27f8cb0acb86e4d2d4 [revision 162] Author: Loren Merritt Date: Fri Mar 11 02:15:25 2005 +0000 write VUI bitstream restrictions git-svn-id: svn://svn.videolan.org/x264/trunk@162 df754926-b1dd-0310-bc7b-ec298dee348c commit 29dee22af6b6174f54bb621f1038c0604a42d21e [revision 161] Author: Loren Merritt Date: Thu Mar 10 23:03:55 2005 +0000 AVI & Avisynth input (win32 only). patch by bobo from Ateme. git-svn-id: svn://svn.videolan.org/x264/trunk@161 df754926-b1dd-0310-bc7b-ec298dee348c commit 79ebb19964a115ab8de21fe1e90162ff9954b283 [revision 160] Author: Loren Merritt Date: Thu Mar 10 21:42:24 2005 +0000 expose option "chroma qp offset" git-svn-id: svn://svn.videolan.org/x264/trunk@160 df754926-b1dd-0310-bc7b-ec298dee348c commit 2fc52995de23f963e67ac408dc247ee3bf68c952 [revision 159] Author: Måns Rullgård Date: Thu Mar 10 19:42:05 2005 +0000 Fix per-frame QP override broken in rev 137. git-svn-id: svn://svn.videolan.org/x264/trunk@159 df754926-b1dd-0310-bc7b-ec298dee348c commit 99b10e79b55520264a29ae4b82d67cd60005faab [revision 158] Author: Måns Rullgård Date: Tue Mar 8 01:08:40 2005 +0000 Don't include x264.o in the library. git-svn-id: svn://svn.videolan.org/x264/trunk@158 df754926-b1dd-0310-bc7b-ec298dee348c commit 9f97e90ef5f3df22a560c10ad49a658041c88629 [revision 157] Author: Loren Merritt Date: Sun Mar 6 21:07:10 2005 +0000 VfW: expose B pyramid and weighted B prediction. patch by Riccardo Stievano. git-svn-id: svn://svn.videolan.org/x264/trunk@157 df754926-b1dd-0310-bc7b-ec298dee348c commit 4fbdc5c1ee77497e6455cd72a895383fb99a77fe [revision 156] Author: Loren Merritt Date: Sun Mar 6 11:39:08 2005 +0000 10l git-svn-id: svn://svn.videolan.org/x264/trunk@156 df754926-b1dd-0310-bc7b-ec298dee348c commit 1f735a32c9626b86b608c9604170b3f4c4549159 [revision 155] Author: Loren Merritt Date: Sun Mar 6 09:50:17 2005 +0000 buffer overrun when bframes == X264_BFRAME_MAX git-svn-id: svn://svn.videolan.org/x264/trunk@155 df754926-b1dd-0310-bc7b-ec298dee348c commit c90534d6c85664c7a161cbe70a7928cb65f19e18 [revision 154] Author: Loren Merritt Date: Sun Mar 6 05:12:25 2005 +0000 Adaptive B skipped some POC numbers (slightly reducing b_direct efficiency). git-svn-id: svn://svn.videolan.org/x264/trunk@154 df754926-b1dd-0310-bc7b-ec298dee348c commit d0bd44f769543e81280a5a97bbe985c6dfd86cf1 [revision 153] Author: Loren Merritt Date: Sat Mar 5 09:34:53 2005 +0000 avc2avi: Use POC to determine frame boundaries (frame_num couldn't distinguish consecutive B-frames). Fix keyframe flag to mark IDR only, not all I slices. git-svn-id: svn://svn.videolan.org/x264/trunk@153 df754926-b1dd-0310-bc7b-ec298dee348c commit f01e3d5f2bffe3a033ecbaa608be6b4f3aca9c60 [revision 152] Author: Loren Merritt Date: Sat Mar 5 04:16:05 2005 +0000 allow 16 refs (instead of 15) git-svn-id: svn://svn.videolan.org/x264/trunk@152 df754926-b1dd-0310-bc7b-ec298dee348c commit c47bb1ffbe630609fda9bd7c9488bae7f0078a4e [revision 151] Author: Loren Merritt Date: Sat Mar 5 00:37:25 2005 +0000 report version number in decimal instead of hex git-svn-id: svn://svn.videolan.org/x264/trunk@151 df754926-b1dd-0310-bc7b-ec298dee348c commit 91536acdb42ec9615a50f5b9f3af34b6c6408049 [revision 150] Author: Loren Merritt Date: Fri Mar 4 12:52:35 2005 +0000 New option: "B-frame pyramid" keeps the middle of 2+ consecutive B-frames as a reference, and reorders frame appropriately. git-svn-id: svn://svn.videolan.org/x264/trunk@150 df754926-b1dd-0310-bc7b-ec298dee348c commit 9591b0383829c707791b7797a68a79008349e198 [revision 149] Author: Loren Merritt Date: Thu Mar 3 04:36:46 2005 +0000 smarter parsing of resolution from commandline git-svn-id: svn://svn.videolan.org/x264/trunk@149 df754926-b1dd-0310-bc7b-ec298dee348c commit 4337ee8de793cb5c6f0dee3b0a851041466fec7e [revision 148] Author: Eric Petit Date: Thu Mar 3 03:02:27 2005 +0000 ratecontrol.c: fixed exp2f on BeOS so rate control works properly git-svn-id: svn://svn.videolan.org/x264/trunk@148 df754926-b1dd-0310-bc7b-ec298dee348c commit 4b2ba852564a05a651f9312651cc402043089648 [revision 147] Author: Loren Merritt Date: Wed Mar 2 22:44:31 2005 +0000 Fix a buffer overrun with very long MVs. git-svn-id: svn://svn.videolan.org/x264/trunk@147 df754926-b1dd-0310-bc7b-ec298dee348c commit ccf61cef868c38bb71d746fcc03f583d93fd3e4c [revision 146] Author: Loren Merritt Date: Mon Feb 28 19:01:58 2005 +0000 wrong stride in lowres image git-svn-id: svn://svn.videolan.org/x264/trunk@146 df754926-b1dd-0310-bc7b-ec298dee348c commit b04a2601088a855361169a3eb5236e8b998f7e70 [revision 145] Author: Loren Merritt Date: Mon Feb 28 18:50:55 2005 +0000 10l (fast1stpass was slower than non-fast) git-svn-id: svn://svn.videolan.org/x264/trunk@145 df754926-b1dd-0310-bc7b-ec298dee348c commit d05adbc7f35e461879f1559a095b82b7253d78cd [revision 144] Author: Loren Merritt Date: Fri Feb 25 03:10:04 2005 +0000 Disable deblocking filter in frames of sufficiently low QP that it would have no effect. (Saves a little CPU time in the decoder.) git-svn-id: svn://svn.videolan.org/x264/trunk@144 df754926-b1dd-0310-bc7b-ec298dee348c commit d836d8f9b7ae3962ac0f5a325f43ca9d6a87a7ff [revision 143] Author: Loren Merritt Date: Fri Feb 25 00:46:56 2005 +0000 Simplify x264_frame_expand_border. git-svn-id: svn://svn.videolan.org/x264/trunk@143 df754926-b1dd-0310-bc7b-ec298dee348c commit 067f22c153eaf19e1ba5ec35deef96a8fb3eae4e [revision 142] Author: Loren Merritt Date: Thu Feb 24 13:09:55 2005 +0000 Altivec functions for MC using the cached halfpel planes. Patch by Fredrik Pettersson . git-svn-id: svn://svn.videolan.org/x264/trunk@142 df754926-b1dd-0310-bc7b-ec298dee348c commit 323b54ffa0bbcfe82b02cb0d204e9ba5121264fd [revision 141] Author: Loren Merritt Date: Thu Feb 24 13:01:21 2005 +0000 Don't use uninitialize MVs in x264_mb_predict_mv_ref16x16. git-svn-id: svn://svn.videolan.org/x264/trunk@141 df754926-b1dd-0310-bc7b-ec298dee348c commit 92f6f36f1d58fd9809263aba16ddeb78ec2ee47d [revision 140] Author: Loren Merritt Date: Thu Feb 24 13:00:34 2005 +0000 Implicit weights in B16x16 analysis were swapped. patch by Radek Czyz. git-svn-id: svn://svn.videolan.org/x264/trunk@140 df754926-b1dd-0310-bc7b-ec298dee348c commit c2b0c8a0e11ef079e82789f79d54c30c9b2364ae [revision 139] Author: Loren Merritt Date: Thu Feb 24 08:31:12 2005 +0000 Cosmetics: Some renaming. Move the rest of slice type decision from encoder.c to slicetype_decision.c git-svn-id: svn://svn.videolan.org/x264/trunk@139 df754926-b1dd-0310-bc7b-ec298dee348c commit 24a6672ecaa6bccc65c4043248c1787e3161062c [revision 138] Author: Loren Merritt Date: Thu Feb 24 08:17:31 2005 +0000 Take into account keyint_max in B-frame decision. git-svn-id: svn://svn.videolan.org/x264/trunk@138 df754926-b1dd-0310-bc7b-ec298dee348c commit 68c13530b5ffc28325aee408f4cd19ab7da06715 [revision 137] Author: Loren Merritt Date: Wed Feb 23 19:58:02 2005 +0000 Preliminary adaptive B-frame decision (not yet tuned). Fix flushing of delayed frames when the encode finishes. git-svn-id: svn://svn.videolan.org/x264/trunk@137 df754926-b1dd-0310-bc7b-ec298dee348c commit e2efb4b7d5885112a32b5b710958fe9fa5458bbf [revision 136] Author: Loren Merritt Date: Tue Feb 22 22:08:07 2005 +0000 Write x264's version in a SEI message. git-svn-id: svn://svn.videolan.org/x264/trunk@136 df754926-b1dd-0310-bc7b-ec298dee348c commit 3ada9c0514d0d785dec7de1f5d092fbda7a629cb [revision 135] Author: Loren Merritt Date: Tue Feb 22 10:46:28 2005 +0000 VfW: Enable weighted B prediction when max B-frames > 1. Enforce max reference frames <= 15. patch by Riccardo Stievano. git-svn-id: svn://svn.videolan.org/x264/trunk@135 df754926-b1dd-0310-bc7b-ec298dee348c commit 834eac288ff5e8d40a1a751d61a59d77d67c0537 [revision 134] Author: Loren Merritt Date: Tue Feb 22 05:19:02 2005 +0000 Add: implicit weighted prediction for B-frames. Slightly optimize x264_mb_mc_01xywh. Fix an error in B16x8 cost. git-svn-id: svn://svn.videolan.org/x264/trunk@134 df754926-b1dd-0310-bc7b-ec298dee348c commit 47706e75fdf80b0c0011e2d697e5e181060a08fe [revision 133] Author: Loren Merritt Date: Sun Feb 20 01:52:12 2005 +0000 Oops, increment API number. git-svn-id: svn://svn.videolan.org/x264/trunk@133 df754926-b1dd-0310-bc7b-ec298dee348c commit d7443f67331e392b580564f815a34c5762f71f03 [revision 132] Author: Loren Merritt Date: Sun Feb 20 01:26:03 2005 +0000 Configurable level. Levels are still not enforced; it's up to the user to select a level compatible with the rest of the encoding options. Patch by Jeff Clagg . git-svn-id: svn://svn.videolan.org/x264/trunk@132 df754926-b1dd-0310-bc7b-ec298dee348c commit 15450dbe916e971793989dc44762e1bda23ca153 [revision 131] Author: Loren Merritt Date: Sat Feb 19 06:18:22 2005 +0000 Always use the tempfile and rename method for multipass stats, so that VfW knows whether the previous pass completed. git-svn-id: svn://svn.videolan.org/x264/trunk@131 df754926-b1dd-0310-bc7b-ec298dee348c commit b1f47ea51c7057ebc0d8938a224662cc6fe23c80 [revision 130] Author: Loren Merritt Date: Fri Feb 18 07:47:35 2005 +0000 More tweaks to bitrate prediction. Change error messages when 2pass fails to converge. git-svn-id: svn://svn.videolan.org/x264/trunk@130 df754926-b1dd-0310-bc7b-ec298dee348c commit 0606b3ac325a3cb3fd1fe648d9a6468ab731f7d5 [revision 129] Author: Loren Merritt Date: Thu Feb 17 19:31:15 2005 +0000 Improved 2pass bitrate predictor. No real change most of the time, but allows correct ratecontrol on some pathological videos that used to diverge completely. Also improves prediction when 2nd pass bitrate is very different from 1st pass. The new qscale2bits() has no simple inverse, so I also had to change rc_eq to output qscale instead of bits. git-svn-id: svn://svn.videolan.org/x264/trunk@129 df754926-b1dd-0310-bc7b-ec298dee348c commit 3b2116cdd0aceff59036a17c6f9aa32592de4851 [revision 128] Author: Loren Merritt Date: Wed Feb 16 04:59:21 2005 +0000 Some defines needed by MSVC, and convert the DSP files to DOS-style newlines. Patch by Radek Czyz. git-svn-id: svn://svn.videolan.org/x264/trunk@128 df754926-b1dd-0310-bc7b-ec298dee348c commit d688918e861714e23f8fa7bdaaa6bf47ffec0395 [revision 127] Author: Loren Merritt Date: Mon Feb 14 23:32:38 2005 +0000 Precalculate lambda*bits for all allowed mvs. 1-2% speedup. git-svn-id: svn://svn.videolan.org/x264/trunk@127 df754926-b1dd-0310-bc7b-ec298dee348c commit ac411e297aaaec200b33b6dab082e12c55c3b7ef [revision 126] Author: Loren Merritt Date: Mon Feb 14 11:08:00 2005 +0000 Deblock B-frames. (Not yet used, since B-frames aren't kept as references.) git-svn-id: svn://svn.videolan.org/x264/trunk@126 df754926-b1dd-0310-bc7b-ec298dee348c commit 50a924885b78abc24dba59bb6717095bcde15d1b [revision 125] Author: Loren Merritt Date: Mon Feb 14 05:58:50 2005 +0000 Simplify x264_mb_mc_01xywh() git-svn-id: svn://svn.videolan.org/x264/trunk@125 df754926-b1dd-0310-bc7b-ec298dee348c commit b2d78b5c7a423a75f0cb555173d92011b4accc44 [revision 124] Author: Loren Merritt Date: Mon Feb 14 04:10:15 2005 +0000 Save some memcopies in halfpel ME. Patch by Radek Czyz. git-svn-id: svn://svn.videolan.org/x264/trunk@124 df754926-b1dd-0310-bc7b-ec298dee348c commit 46141bf206dc672c3ab2b50850df702305ecb8ff [revision 123] Author: Loren Merritt Date: Sun Feb 13 09:49:42 2005 +0000 Cache half-pixel interpolated reference frames, to avoid duplicate motion compensation. 30-50% speedup at subq=5. Patch by Radek Czyz. git-svn-id: svn://svn.videolan.org/x264/trunk@123 df754926-b1dd-0310-bc7b-ec298dee348c commit d81fa19a0af848bd97b2250e1405b5fac54820b1 [revision 122] Author: Loren Merritt Date: Sat Feb 12 12:26:52 2005 +0000 In N-pass mode if stat_in and stat_out are the same file, instead save to a temp file and overwrite stat_in only when the encode finishes. git-svn-id: svn://svn.videolan.org/x264/trunk@122 df754926-b1dd-0310-bc7b-ec298dee348c commit dc270b76915975ab1ea6e16992aa79e96e6801f7 [revision 121] Author: Loren Merritt Date: Fri Feb 11 19:04:44 2005 +0000 VfW: x264_log now creates a window for error messages git-svn-id: svn://svn.videolan.org/x264/trunk@121 df754926-b1dd-0310-bc7b-ec298dee348c commit ef4d1fa4a99a23420708083c66e882d7cfd21d9f [revision 120] Author: Loren Merritt Date: Thu Feb 10 22:11:39 2005 +0000 cosmetics git-svn-id: svn://svn.videolan.org/x264/trunk@120 df754926-b1dd-0310-bc7b-ec298dee348c commit 6f298b9d889536c5dc14fc08faa95447a322a1cd [revision 119] Author: Loren Merritt Date: Thu Feb 10 21:54:40 2005 +0000 bs_align_1() didn't actually write all ones. (so encoded streams with cabac were technically invalid, though no decoder cares.) Patch by Tuukka Toivonen. git-svn-id: svn://svn.videolan.org/x264/trunk@119 df754926-b1dd-0310-bc7b-ec298dee348c commit ca4ae5219a95e05e80e707ce6828b79276e1f795 [revision 118] Author: Loren Merritt Date: Tue Feb 8 23:30:33 2005 +0000 VfW: tweak option names git-svn-id: svn://svn.videolan.org/x264/trunk@118 df754926-b1dd-0310-bc7b-ec298dee348c commit 85c92e1be46a3fa90e81121a03bfb8479e87a2da [revision 117] Author: Loren Merritt Date: Sun Feb 6 06:47:42 2005 +0000 VfW: use separate stats files for each pass of an N-pass encode. git-svn-id: svn://svn.videolan.org/x264/trunk@117 df754926-b1dd-0310-bc7b-ec298dee348c commit 70ce7a4be261f638a614918ed0e822b7f60d8269 [revision 116] Author: Loren Merritt Date: Sat Feb 5 22:55:48 2005 +0000 VfW: Enable multipass by default, increase the configurable range of I and B quant ratios. core: Tweak error messages. git-svn-id: svn://svn.videolan.org/x264/trunk@116 df754926-b1dd-0310-bc7b-ec298dee348c commit 2796ba0138a59efa32357f9dc708eefe01c55882 [revision 115] Author: Loren Merritt Date: Fri Feb 4 01:20:55 2005 +0000 r114 didn't completely fix the problem, trying again. git-svn-id: svn://svn.videolan.org/x264/trunk@115 df754926-b1dd-0310-bc7b-ec298dee348c commit 33140d0984c7415ed0441858022e777794934550 [revision 114] Author: Loren Merritt Date: Thu Feb 3 11:03:17 2005 +0000 Another MV clipping fix. git-svn-id: svn://svn.videolan.org/x264/trunk@114 df754926-b1dd-0310-bc7b-ec298dee348c commit 917924591931e63df4458eb90d5d8bce4bff035d [revision 113] Author: Loren Merritt Date: Tue Feb 1 10:13:51 2005 +0000 Simplify x264_cabac_mb_type. git-svn-id: svn://svn.videolan.org/x264/trunk@113 df754926-b1dd-0310-bc7b-ec298dee348c commit 7927c9ec8c4991040728d64893490c8ecb3d9b44 [revision 112] Author: Loren Merritt Date: Mon Jan 31 12:20:23 2005 +0000 More accurate clipping rectangle for motion search. (slight compression improvement for high-motion scenes) git-svn-id: svn://svn.videolan.org/x264/trunk@112 df754926-b1dd-0310-bc7b-ec298dee348c commit 5b750c35db61a4d26e8801c175da422d11748aad [revision 111] Author: Eric Petit Date: Fri Jan 28 15:17:51 2005 +0000 encoder/encoder.c: gcc < 3 compile fix git-svn-id: svn://svn.videolan.org/x264/trunk@111 df754926-b1dd-0310-bc7b-ec298dee348c commit 3f098e9644a31f17490f05d0ecea08e6443aa110 [revision 110] Author: Loren Merritt Date: Fri Jan 28 13:47:14 2005 +0000 Change default level from 2.1 to 4.0 until I get around to calculating actual levels. git-svn-id: svn://svn.videolan.org/x264/trunk@110 df754926-b1dd-0310-bc7b-ec298dee348c commit bccb009f9fa67797da6cd3da742ad9a27266b12b [revision 109] Author: Loren Merritt Date: Fri Jan 28 02:51:21 2005 +0000 Clipping mvs to within picture + emulated border when running motion compensation. git-svn-id: svn://svn.videolan.org/x264/trunk@109 df754926-b1dd-0310-bc7b-ec298dee348c commit c16119a2ccf2dff2328628dd9ded4681f5502c38 [revision 108] Author: Loren Merritt Date: Thu Jan 27 11:33:14 2005 +0000 Fix clipping of mvs in probe_pskip. (Previously it mixed up fullpel with qpel.) This should eliminate the black blocks that sometimes appeared in high motion, low detail scenes. git-svn-id: svn://svn.videolan.org/x264/trunk@108 df754926-b1dd-0310-bc7b-ec298dee348c commit 6558c8322f175e3970c8a2f351dd7f8f66e130d2 [revision 107] Author: Loren Merritt Date: Tue Jan 25 22:25:05 2005 +0000 Fix length of strings stored in the registry. Patch by Riccardo Stievano. git-svn-id: svn://svn.videolan.org/x264/trunk@107 df754926-b1dd-0310-bc7b-ec298dee348c commit 084175d95978011c837e4363616f5cac5794bc07 [revision 106] Author: Loren Merritt Date: Mon Jan 24 22:55:48 2005 +0000 registry values for min/max keyint were mixed up git-svn-id: svn://svn.videolan.org/x264/trunk@106 df754926-b1dd-0310-bc7b-ec298dee348c commit 79f73aa2f5a5a6d16dadbc10c7ae9647fae76a29 [revision 105] Author: Loren Merritt Date: Sun Jan 23 09:38:42 2005 +0000 VfW: expose option "Nth pass" (i.e. simultaneously read and update the multipass stats file). Patch by Riccardo Stievano. git-svn-id: svn://svn.videolan.org/x264/trunk@105 df754926-b1dd-0310-bc7b-ec298dee348c commit 0292410a8da029c45625c9f8670c9bf16c828c12 [revision 104] Author: Loren Merritt Date: Fri Jan 21 08:22:47 2005 +0000 add "make NDEBUG=1" to strip library git-svn-id: svn://svn.videolan.org/x264/trunk@104 df754926-b1dd-0310-bc7b-ec298dee348c commit 66ee02bdb20e5fe43f3dabbaa10e61d49a945a03 [revision 103] Author: Loren Merritt Date: Tue Jan 18 21:32:20 2005 +0000 finish subpixel motion refinement for B-frames (up to 6% reduced size of B-frames at subq <= 3) git-svn-id: svn://svn.videolan.org/x264/trunk@103 df754926-b1dd-0310-bc7b-ec298dee348c commit 37c0a244e49e607855a55a06b4b911eaadbc4604 [revision 102] Author: Loren Merritt Date: Tue Jan 18 12:19:39 2005 +0000 VfW: expose the 2pass ratecontrol option: qcomp ("bitrate variability"). Some rearranging of the advanced configuration dialogue. Patch by Riccardo Stievano . git-svn-id: svn://svn.videolan.org/x264/trunk@102 df754926-b1dd-0310-bc7b-ec298dee348c commit c80d310f2af65750dafc3decdab6c1df2cbbc5e3 [revision 101] Author: Loren Merritt Date: Mon Jan 17 04:29:24 2005 +0000 VfW: Support ip_factor and pb_factor, some cleanups. patch by Riccardo Stievano git-svn-id: svn://svn.videolan.org/x264/trunk@101 df754926-b1dd-0310-bc7b-ec298dee348c commit 19ed02568e95f69e2dd33f9b8d8cd1ff900f268b [revision 100] Author: Loren Merritt Date: Sat Jan 15 11:28:44 2005 +0000 Use floats instead of int64 in log messages, since win32 (incl. mingw) doesn't understand %lld. Also display MB statistics in percent instead of number. git-svn-id: svn://svn.videolan.org/x264/trunk@100 df754926-b1dd-0310-bc7b-ec298dee348c commit 4394c4d549caac00dedfeacffa14857839365f04 [revision 99] Author: Loren Merritt Date: Sat Jan 15 10:28:51 2005 +0000 finished printf -> x264_log conversion. git-svn-id: svn://svn.videolan.org/x264/trunk@99 df754926-b1dd-0310-bc7b-ec298dee348c commit 04bb83346e4c7ba29c4f6b0c5e376ebde81a899b [revision 98] Author: Loren Merritt Date: Fri Jan 14 21:38:13 2005 +0000 Don't apply keyframe boost to I-frames that are followed by another I. git-svn-id: svn://svn.videolan.org/x264/trunk@98 df754926-b1dd-0310-bc7b-ec298dee348c commit 7b46f42d142f790ecf053f952ce024b467eac762 [revision 97] Author: Loren Merritt Date: Fri Jan 14 01:04:28 2005 +0000 New VfW option: "fast 1st pass" automatically disables some partitions and reduces ME quality and number of reference frames. Removed option direct_pred=none, since it provides no benefits. Patch by Riccardo Stievano . git-svn-id: svn://svn.videolan.org/x264/trunk@97 df754926-b1dd-0310-bc7b-ec298dee348c commit 177e211333d91a06ec2df3ac87c12336812d32e6 [revision 96] Author: Loren Merritt Date: Thu Jan 13 19:47:51 2005 +0000 vfw: tweak wording and defaults git-svn-id: svn://svn.videolan.org/x264/trunk@96 df754926-b1dd-0310-bc7b-ec298dee348c commit b80ed7030d5979bfa2da92a2584078c7f844f28f [revision 95] Author: Måns Rullgård Date: Thu Jan 13 18:18:05 2005 +0000 From Riccardo Stievano : here's a patch that fixes the VfW frontend after the changes made in revision 93 (GOP size management). Default values for i_keyint_max and i_keyint_min have been set to 250 and 10, respectively. git-svn-id: svn://svn.videolan.org/x264/trunk@95 df754926-b1dd-0310-bc7b-ec298dee348c commit 0a7090477d28a3f708a7b1edb89845e10b71191d [revision 94] Author: Loren Merritt Date: Thu Jan 13 06:11:22 2005 +0000 My last change of IDR decision broke in 2pass mode. fixed by remembering which frames are IDR. Disable benchmarking, as it was very slow for some people, and we already know that all the time is spent in macroblock analysis. git-svn-id: svn://svn.videolan.org/x264/trunk@94 df754926-b1dd-0310-bc7b-ec298dee348c commit 648328088b1c4bfb3afcbc92b6711cb7b7b5e068 [revision 93] Author: Loren Merritt Date: Wed Jan 12 09:50:38 2005 +0000 Changes the mechanics of max keyframe interval: Now enforces min and max GOP sizes, and allows variable numbers of non-IDR I-frames within a GOP. git-svn-id: svn://svn.videolan.org/x264/trunk@93 df754926-b1dd-0310-bc7b-ec298dee348c commit 5b13c839df14fdd8b94724230ec2e92cba3164a1 [revision 92] Author: Loren Merritt Date: Wed Jan 12 05:23:16 2005 +0000 MinGW compatible resource.rc by Radek Czyz git-svn-id: svn://svn.videolan.org/x264/trunk@92 df754926-b1dd-0310-bc7b-ec298dee348c commit 8379464a55722872e63cb6b2120e81ed5ac80781 [revision 91] Author: Loren Merritt Date: Wed Jan 12 04:45:10 2005 +0000 strict QP offset for B-frame vs following P-frame strict QP offset for I-frame vs GOP average git-svn-id: svn://svn.videolan.org/x264/trunk@91 df754926-b1dd-0310-bc7b-ec298dee348c commit 0af83ad2ddca90ba0a4066d67513a32798395ce6 [revision 90] Author: Loren Merritt Date: Tue Jan 11 06:20:37 2005 +0000 r72 broke B-frames without intra4x4. fixed. git-svn-id: svn://svn.videolan.org/x264/trunk@90 df754926-b1dd-0310-bc7b-ec298dee348c commit 5ccb93c1a78a8dfeb1953426b89494f6f5d36fec [revision 89] Author: Loren Merritt Date: Mon Jan 10 09:29:31 2005 +0000 updated VfW interface by Radek Czyz git-svn-id: svn://svn.videolan.org/x264/trunk@89 df754926-b1dd-0310-bc7b-ec298dee348c commit fc2e7ba68bfcb5b22f510839b6f0b3da333671fd [revision 88] Author: Loren Merritt Date: Sat Jan 8 02:51:24 2005 +0000 improved mv prediction: 1-3% better compression of B-frames early termination for B-frame ref search: up to 20% faster with lots of refs. git-svn-id: svn://svn.videolan.org/x264/trunk@88 df754926-b1dd-0310-bc7b-ec298dee348c commit 7fc5f99d43b5c2498b368e6f5c1620f591bd2a45 [revision 87] Author: Loren Merritt Date: Fri Jan 7 18:45:11 2005 +0000 allow constant qp on Nth pass (e.g. for forcing frame types) git-svn-id: svn://svn.videolan.org/x264/trunk@87 df754926-b1dd-0310-bc7b-ec298dee348c commit 094266b31789338c7a6b91f96aa2fc8c1bd72f94 [revision 86] Author: Loren Merritt Date: Fri Jan 7 11:08:55 2005 +0000 disable subme=0 (the huge bitrate penalty wasn't worth the speed) renumber direct_pred git-svn-id: svn://svn.videolan.org/x264/trunk@86 df754926-b1dd-0310-bc7b-ec298dee348c commit 3a11b25d31f7eac4ccc4252b5525a6554c6d22ba [revision 85] Author: Loren Merritt Date: Wed Jan 5 09:15:35 2005 +0000 oops, last patch had some debug statements git-svn-id: svn://svn.videolan.org/x264/trunk@85 df754926-b1dd-0310-bc7b-ec298dee348c commit 0ad06e691819f2a2a05f673b50afe8b676d48f44 [revision 84] Author: Loren Merritt Date: Wed Jan 5 07:08:40 2005 +0000 fix: "x264 -A all" didn't include b8x8 types. add: "make NDEBUG=1" to strip library update TODO with B-frame status git-svn-id: svn://svn.videolan.org/x264/trunk@84 df754926-b1dd-0310-bc7b-ec298dee348c commit 53295729673187df0ab1143c08fae578b5447376 [revision 83] Author: Loren Merritt Date: Wed Jan 5 06:59:29 2005 +0000 Reorganize frame type selection: No longer produces consecutive I-frames when B-frames are enabled. Not thoroughly tested, but works for me. Fix scenecut detection when B-frames are present: Can now produce IDR, but is slower since it re-encodes more frames. This might reduce compression ratio in the presence of quick fade-ins. 2pass ratecontrol deals more gracefully with completely skipped frames. git-svn-id: svn://svn.videolan.org/x264/trunk@83 df754926-b1dd-0310-bc7b-ec298dee348c commit b1d946cd95c7b4d3ab618b9bbb0191949a49ad4c [revision 82] Author: Loren Merritt Date: Mon Jan 3 03:47:49 2005 +0000 remove Makefile.cygwin because build/cygwin/Makefile is more up to date. put correct object file names in .depend git-svn-id: svn://svn.videolan.org/x264/trunk@82 df754926-b1dd-0310-bc7b-ec298dee348c commit 456e8fdc5d5a20ad14a41e83db21b9aa5529c476 [revision 81] Author: Loren Merritt Date: Mon Jan 3 02:32:44 2005 +0000 reduce default verbosity, add option -v git-svn-id: svn://svn.videolan.org/x264/trunk@81 df754926-b1dd-0310-bc7b-ec298dee348c commit 007c6f71c2b0d2868dfc46edaf24a5c27eceab47 [revision 80] Author: Loren Merritt Date: Fri Dec 31 02:33:21 2004 +0000 remove relative include paths, to avoid conflicts with libtool git-svn-id: svn://svn.videolan.org/x264/trunk@80 df754926-b1dd-0310-bc7b-ec298dee348c commit b42bd7463a00b65f1e2d5e2c10ff374531e997f6 [revision 79] Author: Loren Merritt Date: Fri Dec 31 01:56:26 2004 +0000 rename *.asm to avoid conflicts with libtool git-svn-id: svn://svn.videolan.org/x264/trunk@79 df754926-b1dd-0310-bc7b-ec298dee348c commit dabd095c2a59ec95f83428555a329049e4ab165f [revision 78] Author: Loren Merritt Date: Thu Dec 30 23:58:06 2004 +0000 list default settings in --help git-svn-id: svn://svn.videolan.org/x264/trunk@78 df754926-b1dd-0310-bc7b-ec298dee348c commit fc1380db831cab2d90a8a116848d9263bd83b871 [revision 77] Author: Loren Merritt Date: Thu Dec 30 04:01:58 2004 +0000 replace EPZS diamond with a hexagon search pattern. early termination for multiple reference frame search (up to 1.5x faster). git-svn-id: svn://svn.videolan.org/x264/trunk@77 df754926-b1dd-0310-bc7b-ec298dee348c commit ab0c769d9813d82f9d7d6f82cce289ae2e466db8 [revision 76] Author: Loren Merritt Date: Wed Dec 29 21:18:14 2004 +0000 sps->i_num_ref_frames was set higher than necessary git-svn-id: svn://svn.videolan.org/x264/trunk@76 df754926-b1dd-0310-bc7b-ec298dee348c commit e33ed4c9fc447b7e4bc057e85f9ee83b07b714e1 [revision 75] Author: Loren Merritt Date: Wed Dec 29 12:08:50 2004 +0000 new option: --fps git-svn-id: svn://svn.videolan.org/x264/trunk@75 df754926-b1dd-0310-bc7b-ec298dee348c commit d5322b4e055d3710ea48a1d3cb336b5264a9621e [revision 74] Author: Loren Merritt Date: Wed Dec 29 10:53:03 2004 +0000 various cleanups in macroblock caching. store motion data for each reference frame (but not yet used). git-svn-id: svn://svn.videolan.org/x264/trunk@74 df754926-b1dd-0310-bc7b-ec298dee348c commit 9df283c3b29afdcb20387997b09893e450075976 [revision 73] Author: Loren Merritt Date: Tue Dec 28 10:14:19 2004 +0000 more accurate cost for psub8x8 modes. git-svn-id: svn://svn.videolan.org/x264/trunk@73 df754926-b1dd-0310-bc7b-ec298dee348c commit 4e5e3770a6b2366676232fe5f335f572f6cdefcb [revision 72] Author: Loren Merritt Date: Thu Dec 23 04:33:36 2004 +0000 implement macroblock types B_16x8, B_8x16 tweak thresholds for comparing B mb types git-svn-id: svn://svn.videolan.org/x264/trunk@72 df754926-b1dd-0310-bc7b-ec298dee348c commit efbf4ad58c26c6a609a43a2b636ce50e1272f101 [revision 71] Author: Loren Merritt Date: Wed Dec 22 21:09:45 2004 +0000 simplify x264_mb_predict_mv_direct16x16_temporal git-svn-id: svn://svn.videolan.org/x264/trunk@71 df754926-b1dd-0310-bc7b-ec298dee348c commit 457eaa93110fa95c393794d85bde7943b2d325bd [revision 70] Author: Loren Merritt Date: Wed Dec 22 20:52:13 2004 +0000 option '--frames' limits number of frames to encode. patch by Tuukka Toivonen git-svn-id: svn://svn.videolan.org/x264/trunk@70 df754926-b1dd-0310-bc7b-ec298dee348c commit dfbbcec847ddb9b9bbc4549671c6ce7533a7c098 [revision 69] Author: Loren Merritt Date: Wed Dec 22 20:29:19 2004 +0000 simplify calvc mb type git-svn-id: svn://svn.videolan.org/x264/trunk@69 df754926-b1dd-0310-bc7b-ec298dee348c commit 199ff7406b76dc1c10b756053398bf8a834bcf5c [revision 68] Author: Loren Merritt Date: Fri Dec 17 10:57:02 2004 +0000 implement macroblock types B_SKIP, B_DIRECT, B_8x8 git-svn-id: svn://svn.videolan.org/x264/trunk@68 df754926-b1dd-0310-bc7b-ec298dee348c commit b6954ba2bba2f4fc002e8be4f57d7f3b43871c33 [revision 67] Author: Loren Merritt Date: Tue Dec 14 02:04:02 2004 +0000 rename 'core/' to 'common/', which avoids conflicts with libtool git-svn-id: svn://svn.videolan.org/x264/trunk@67 df754926-b1dd-0310-bc7b-ec298dee348c commit da49faec0719a3e774177356c66a4b41ddd0b10c [revision 66] Author: Loren Merritt Date: Wed Dec 8 05:01:57 2004 +0000 cleanup stats reporting report B macroblock types report average QP git-svn-id: svn://svn.videolan.org/x264/trunk@66 df754926-b1dd-0310-bc7b-ec298dee348c commit bc0a1e9b79d725680418be2bf7cf584a739ca47b [revision 65] Author: Loren Merritt Date: Wed Dec 8 02:28:58 2004 +0000 apply ip_factor and pb_factor in constant quantiser encodes. git-svn-id: svn://svn.videolan.org/x264/trunk@65 df754926-b1dd-0310-bc7b-ec298dee348c commit 25b542c01b85c6da7bea789fc60f1f22d7281488 [revision 64] Author: Loren Merritt Date: Wed Dec 1 21:23:06 2004 +0000 save a little bit of memory git-svn-id: svn://svn.videolan.org/x264/trunk@64 df754926-b1dd-0310-bc7b-ec298dee348c commit 0f65f519a8539602e08fd87d9c21b0b3f34a80d8 [revision 63] Author: Loren Merritt Date: Mon Nov 22 07:34:17 2004 +0000 multiple hypothesis mv prediction: 1-3% improved compression, and .5-1% faster git-svn-id: svn://svn.videolan.org/x264/trunk@63 df754926-b1dd-0310-bc7b-ec298dee348c commit 2489b6a66d960d46db325a15314ac82fc9f3ed1a [revision 62] Author: Laurent Aimar Date: Thu Nov 18 12:30:27 2004 +0000 * analyse: we can do 4x4 Horizontal Up mode when LEFT is avaible. Thanks Stephen Henry for the report. git-svn-id: svn://svn.videolan.org/x264/trunk@62 df754926-b1dd-0310-bc7b-ec298dee348c commit 71f820d76b4a38b4ed73a12a8aeb93589b18527a [revision 61] Author: Loren Merritt Date: Wed Nov 17 18:40:26 2004 +0000 improved 2pass ratecontrol: ensures that I-frames have comparable quantizer to the following P-frames, and produces more consistent quality in areas of fluctuating complexity. git-svn-id: svn://svn.videolan.org/x264/trunk@61 df754926-b1dd-0310-bc7b-ec298dee348c commit 7972ae1f284b1130abb6cc4f3fe963dcb0ee48c8 [revision 60] Author: Loren Merritt Date: Fri Nov 12 07:14:24 2004 +0000 more informative error message when 2pass fails to converge git-svn-id: svn://svn.videolan.org/x264/trunk@60 df754926-b1dd-0310-bc7b-ec298dee348c commit 30c6a09063e499f47077edab206fd79992914ef7 [revision 59] Author: Måns Rullgård Date: Thu Nov 11 12:37:24 2004 +0000 #include git-svn-id: svn://svn.videolan.org/x264/trunk@59 df754926-b1dd-0310-bc7b-ec298dee348c commit a0b32b7cc24d59826f311dcd2e64945a80803dc2 [revision 58] Author: Loren Merritt Date: Thu Nov 4 09:19:34 2004 +0000 cleanup spacing of frame stats with verbose logging. git-svn-id: svn://svn.videolan.org/x264/trunk@58 df754926-b1dd-0310-bc7b-ec298dee348c commit bdf9b10610bf0141ede5812a70d889f7b557560d [revision 57] Author: Loren Merritt Date: Thu Oct 28 20:10:53 2004 +0000 typo in x264_cabac_mb_sub_b_partition (see ITU-T H.264 clause 9.3.3.1.2) git-svn-id: svn://svn.videolan.org/x264/trunk@57 df754926-b1dd-0310-bc7b-ec298dee348c commit e917887b2e16e5d00d67fa8da7cd828a456fd75d [revision 56] Author: Eric Petit Date: Wed Oct 27 19:14:24 2004 +0000 Typo git-svn-id: svn://svn.videolan.org/x264/trunk@56 df754926-b1dd-0310-bc7b-ec298dee348c commit 851989ac7c839ee2bf42c74a6fd90b5eb78f0a69 [revision 55] Author: Eric Petit Date: Wed Oct 27 19:06:47 2004 +0000 + No need to emulate memalign on OS X + Fixed Makefile for OS X (Original patch by Peter Handel) git-svn-id: svn://svn.videolan.org/x264/trunk@55 df754926-b1dd-0310-bc7b-ec298dee348c commit 57554925f4075d00708be958853d5b2e9a9f6487 [revision 54] Author: Måns Rullgård Date: Wed Oct 27 15:43:15 2004 +0000 Conditionally inits 1pass rc, only if it's enabled. This prevents a couple of irrelevant warnings from appearing in constant QP mode. (Loren Merritt ) git-svn-id: svn://svn.videolan.org/x264/trunk@54 df754926-b1dd-0310-bc7b-ec298dee348c commit c9a501a467f06277ed72ba5e222ba00b018364eb [revision 53] Author: Måns Rullgård Date: Mon Oct 25 09:40:23 2004 +0000 Oops, changing those types messed up some vprintf's. fixed. (Loren Merrit ) git-svn-id: svn://svn.videolan.org/x264/trunk@53 df754926-b1dd-0310-bc7b-ec298dee348c commit 46d0385cdd96b8d0753016bb255108a3adb3ba86 [revision 52] Author: Måns Rullgård Date: Tue Oct 19 21:35:18 2004 +0000 filesize (bits) in a 32 bit int will overflow after 250MB, screwing up 2pass ratecontrol. (patch by Loren Merritt ) git-svn-id: svn://svn.videolan.org/x264/trunk@52 df754926-b1dd-0310-bc7b-ec298dee348c commit 3f7206dc5bfb84ff4e1fb494d9ecf09e98baa937 [revision 51] Author: Måns Rullgård Date: Mon Oct 11 10:13:05 2004 +0000 fix compilation on FreeBSD (from Loren Merritt (thanks to Igla)) git-svn-id: svn://svn.videolan.org/x264/trunk@51 df754926-b1dd-0310-bc7b-ec298dee348c commit 48937c64c13904973f83495b64afdf4223f647f5 [revision 50] Author: Laurent Aimar Date: Wed Sep 29 16:05:24 2004 +0000 * ratecontrol: Patch by Loren Merritt : " This patch * calculates average QP as a float, providing slightly improved ratecontrol if the first pass was CBR. * fixes the reported QP if you set both b_stat_read and b_stat_write, allowing 3 pass encoding (or just examination of the 2nd pass's stats)." git-svn-id: svn://svn.videolan.org/x264/trunk@50 df754926-b1dd-0310-bc7b-ec298dee348c commit e3ae8a7d1a3953c0f261cba6e1b161cfc7f1b0d6 [revision 49] Author: Laurent Aimar Date: Wed Sep 29 16:02:18 2004 +0000 * all: Patch by Loren Merritt. " This patch makes scene-cut detection based on the relative cost of I-frame vs P-frame, rather than just on the number of I-blocks used. It also makes the scene-cut threshold configurable. This doesn't have a very large effect: Most scene cuts are obvious to either algorithm. But I think this way is better in some less clear cut cases, and sometimes finds a better spot for an I-frame than just waiting for the max I-frame interval." git-svn-id: svn://svn.videolan.org/x264/trunk@49 df754926-b1dd-0310-bc7b-ec298dee348c commit 79a2bb78b970e55e92bb4a74ff5f88dc6d0a6851 [revision 48] Author: Laurent Aimar Date: Wed Sep 22 07:37:43 2004 +0000 * ratecontrol: added 'b' flag to fopen. git-svn-id: svn://svn.videolan.org/x264/trunk@48 df754926-b1dd-0310-bc7b-ec298dee348c commit 48e288644ed493a77b5935e90010973a4e15faf6 [revision 47] Author: Laurent Aimar Date: Wed Sep 22 07:07:48 2004 +0000 * all: Patches by Loren Merritt: "Improved patch. Now supports subpel ME on all candidate MB types, not just on the winner. subpel_refine: (completely different scale from before) 0 => halfpel only 1 => 1 iteration of qpel on the winner (same as x264 r46) 2 => 2 iterations of qpel (about the same as my earlier patch, but faster 3 => halfpel on all MB types, qpel on the winner 4 => qpel on all 5 => more iterations benchmarks: mencoder dvd://1 -ovc x264 -x264encopts qp_constant=19:fullinter:cabac:iframe=200:psnr subpel_refine=1: PSNR Global:46.82 kb/s:1048.1 fps:17.335 subpel_refine=2: PSNR Global:46.83 kb/s:1034.4 fps:16.970 subpel_refine=3: PSNR Global:46.84 kb/s:1023.3 fps:14.770 subpel_refine=4: PSNR Global:46.87 kb/s:1010.8 fps:11.598 subpel_refine=5: PSNR Global:46.88 kb/s:1006.9 fps:10.824" And "The current code for calculating the cost of encoding which reference frame a MB is predicted from, introduces a bias towards ref0 and against P16x16. Removing this bias produces an improvement of .4% - 2% bitrate, depending on content and number of reference frames." git-svn-id: svn://svn.videolan.org/x264/trunk@47 df754926-b1dd-0310-bc7b-ec298dee348c commit f9bd35a32d8de87c67749c1628ea7693a2b83460 [revision 46] Author: Laurent Aimar Date: Sun Aug 29 12:02:50 2004 +0000 * x264: added --ipratio --pbratio in help section. git-svn-id: svn://svn.videolan.org/x264/trunk@46 df754926-b1dd-0310-bc7b-ec298dee348c commit 8c6f6fa0db394e65fa9b4272deec859c9cb67aac [revision 45] Author: Laurent Aimar Date: Sun Aug 29 11:32:34 2004 +0000 * ratecontrol: path by Loren Merritt. "Use average qp instead of last qp in the frame for 2pass rc. (Improves quality and rate accuracy if the first pass was cbr.)" git-svn-id: svn://svn.videolan.org/x264/trunk@45 df754926-b1dd-0310-bc7b-ec298dee348c commit d46df39630925245993e3ecde07adce618d3c30a [revision 44] Author: Laurent Aimar Date: Sat Aug 28 22:30:44 2004 +0000 * x264: added --quiet and --no-psnr. git-svn-id: svn://svn.videolan.org/x264/trunk@44 df754926-b1dd-0310-bc7b-ec298dee348c commit cba0cd394dfd87f82daa4b621dd9b701fca5bb9f [revision 43] Author: Laurent Aimar Date: Sat Aug 28 22:19:47 2004 +0000 * eval.c: lalala ;) git-svn-id: svn://svn.videolan.org/x264/trunk@43 df754926-b1dd-0310-bc7b-ec298dee348c commit 8a5aa764ab46fd04dd60acfa3a6641742c4b0daa [revision 42] Author: Laurent Aimar Date: Sat Aug 28 22:19:15 2004 +0000 * added Loren Merritt. git-svn-id: svn://svn.videolan.org/x264/trunk@42 df754926-b1dd-0310-bc7b-ec298dee348c commit 58b7012e219b64752168084fa028e76792b6f42a [revision 41] Author: Laurent Aimar Date: Sat Aug 28 22:16:48 2004 +0000 * all: added eval.c (I hope libx264.dsp is correct, I can't test). git-svn-id: svn://svn.videolan.org/x264/trunk@41 df754926-b1dd-0310-bc7b-ec298dee348c commit 9116d300befee74c92ff3ac9fe7625a57dafab48 [revision 40] Author: Laurent Aimar Date: Sat Aug 28 22:14:26 2004 +0000 * all: 2pass patch by Loren Merritt "Mostly borrowed from libavcodec. There is not much theoretical basis behind my choice of defaults for rc_eq, qcompress, qblur, and ip_factor." git-svn-id: svn://svn.videolan.org/x264/trunk@40 df754926-b1dd-0310-bc7b-ec298dee348c commit 67b673006c7b13a3435e89ccf0a83ebbdd23937c [revision 39] Author: Laurent Aimar Date: Sat Aug 28 19:24:08 2004 +0000 * all: first part of the 2pass patch by Loren Merritt (only the header/textures bits computed for now). git-svn-id: svn://svn.videolan.org/x264/trunk@39 df754926-b1dd-0310-bc7b-ec298dee348c commit 0a7d38ca3a65133af46a438d00ff25948a78019a [revision 38] Author: Laurent Aimar Date: Sun Aug 22 15:01:46 2004 +0000 * all: include stdarg.h (needed for x264_log) git-svn-id: svn://svn.videolan.org/x264/trunk@38 df754926-b1dd-0310-bc7b-ec298dee348c commit 63e5a5c8865f76d9fb0af10ee238fdb1bea1178c [revision 37] Author: Måns Rullgård Date: Wed Aug 18 09:28:56 2004 +0000 Use x264_log() in ratecontrol.c git-svn-id: svn://svn.videolan.org/x264/trunk@37 df754926-b1dd-0310-bc7b-ec298dee348c commit f246a87fa64a4febe10a17e983f1b656d7914625 [revision 36] Author: Laurent Aimar Date: Tue Aug 17 21:08:23 2004 +0000 * encoder/encoder.c: oops. (fixed compilation). git-svn-id: svn://svn.videolan.org/x264/trunk@36 df754926-b1dd-0310-bc7b-ec298dee348c commit dab6f065ffd7ab2ecc591ef22e0d556a3516a48f [revision 35] Author: Laurent Aimar Date: Tue Aug 17 20:39:03 2004 +0000 * all: more fprintf -> x264_log. git-svn-id: svn://svn.videolan.org/x264/trunk@35 df754926-b1dd-0310-bc7b-ec298dee348c commit f53a7ae05c5f348baf1322d6d233aba899287df0 [revision 34] Author: Laurent Aimar Date: Tue Aug 17 20:27:05 2004 +0000 * all: added a x264_param_t.analyse.b_psnr git-svn-id: svn://svn.videolan.org/x264/trunk@34 df754926-b1dd-0310-bc7b-ec298dee348c commit 72eced43c35ecce69b422c2da228ae044430c038 [revision 33] Author: Laurent Aimar Date: Tue Aug 17 20:03:46 2004 +0000 * encoder/encoder.c: kb/s with k=1000 (more consistant). Patch by Loren Merritt git-svn-id: svn://svn.videolan.org/x264/trunk@33 df754926-b1dd-0310-bc7b-ec298dee348c commit a01315f4e71b38b63ca08f9453b3409bf4f044b8 [revision 32] Author: Laurent Aimar Date: Tue Aug 17 19:56:36 2004 +0000 * all: introduced a x264_log function. It's not yet used everywhere but we should start using it :) git-svn-id: svn://svn.videolan.org/x264/trunk@32 df754926-b1dd-0310-bc7b-ec298dee348c commit 5ffe5a90e7553c20f40dcb1ae372579a941280cb [revision 31] Author: Eric Petit Date: Mon Aug 16 08:52:05 2004 +0000 OS X is missing exp2f() git-svn-id: svn://svn.videolan.org/x264/trunk@31 df754926-b1dd-0310-bc7b-ec298dee348c commit 348de7f684821ffbb5e8a3a52969986de00c89c8 [revision 30] Author: Eric Petit Date: Mon Aug 16 08:47:51 2004 +0000 Fixed warnings with PPC 64 git-svn-id: svn://svn.videolan.org/x264/trunk@30 df754926-b1dd-0310-bc7b-ec298dee348c commit 2885471d5ea17e4e84b9e2551d8c1e2049bb3d7f [revision 29] Author: Måns Rullgård Date: Fri Aug 13 13:36:14 2004 +0000 Add my svn user name. git-svn-id: svn://svn.videolan.org/x264/trunk@29 df754926-b1dd-0310-bc7b-ec298dee348c commit ed61d8ee02dabbfc7c667ee4711e5c9cd95f2032 [revision 28] Author: Måns Rullgård Date: Fri Aug 13 13:34:47 2004 +0000 Bugfix. git-svn-id: svn://svn.videolan.org/x264/trunk@28 df754926-b1dd-0310-bc7b-ec298dee348c commit 6b20e508a7f7110b4c951fbdc4a971cc5c14494d [revision 27] Author: Måns Rullgård Date: Thu Aug 12 20:52:24 2004 +0000 Include timing info in VUI. Change frame rate from float to fraction (sorry for the inconvenience). git-svn-id: svn://svn.videolan.org/x264/trunk@27 df754926-b1dd-0310-bc7b-ec298dee348c commit 2b3cd6c669b64046a54b32577946bc262360d8ae [revision 26] Author: Måns Rullgård Date: Thu Aug 12 13:07:41 2004 +0000 Add TAGS rule. git-svn-id: svn://svn.videolan.org/x264/trunk@26 df754926-b1dd-0310-bc7b-ec298dee348c commit 444615252b766c95c3dcbb11327c379101929468 [revision 25] Author: Måns Rullgård Date: Wed Aug 11 20:24:20 2004 +0000 Fixes by Loren Merritt (lorenm at u.washington.edu). git-svn-id: svn://svn.videolan.org/x264/trunk@25 df754926-b1dd-0310-bc7b-ec298dee348c commit 78292e08d9e9770b1c74766368670eeeb3f02e3e [revision 24] Author: Måns Rullgård Date: Wed Aug 11 01:02:05 2004 +0000 Get rid of integer overflows that caused the rate control to go haywire in some situations. git-svn-id: svn://svn.videolan.org/x264/trunk@24 df754926-b1dd-0310-bc7b-ec298dee348c commit 374baca15e0ce069cadd4c32633f915b6f8294b0 [revision 23] Author: Laurent Aimar Date: Mon Aug 9 00:05:22 2004 +0000 * encoder: correct range for i_idr_pic_id is 0..65535 (Not 0..65534) git-svn-id: svn://svn.videolan.org/x264/trunk@23 df754926-b1dd-0310-bc7b-ec298dee348c commit 07a0494bd1b2c1ec05866d09bf84d0564d6e5080 [revision 22] Author: Laurent Aimar Date: Sun Aug 8 21:36:41 2004 +0000 ratecontrol: patch by Loren Merritt "The new cbr mode fails to completely disable itself when encoding in constant QP mode. The per-block QPs are then randomized between QP+4 and QP-2 based on uninitialized ratecontrol parameters." git-svn-id: svn://svn.videolan.org/x264/trunk@22 df754926-b1dd-0310-bc7b-ec298dee348c commit 249259d03700724183b879cd504a172d9c2d35f6 [revision 21] Author: Laurent Aimar Date: Sun Aug 8 19:15:10 2004 +0000 * ratecontrol: patch by Måns Rullgård "This patch fixes a small bug (divide by 0 possible) in the rate control." git-svn-id: svn://svn.videolan.org/x264/trunk@21 df754926-b1dd-0310-bc7b-ec298dee348c commit e96703d73226f61149e2c815a03f8443a620ffff [revision 20] Author: Laurent Aimar Date: Sun Aug 8 16:18:49 2004 +0000 * encoder: simpler scene cut detection (seems better but do not check size anymore, so need more testing). git-svn-id: svn://svn.videolan.org/x264/trunk@20 df754926-b1dd-0310-bc7b-ec298dee348c commit 11e1b0c27fdd2213007fdb91f40d0fc2a1c11569 [revision 19] Author: Laurent Aimar Date: Sun Aug 8 14:23:50 2004 +0000 * all: Change the way PSNR is computed (based on a patch by Loren Merritt Using SQE(DeltaSourceReconstructed) = Sum( delta^2 ) PSNR( SQE, Size ) = -10Ln(SQE / 255^2 / Size )/Ln(10) ) Y+U+V : Union of YUV planes. Now there is - Mean PSNR : Sum( PSNR( SQE(Y/U/V), Size(Y/U/V) ) / TotalFrames - Average PSNR: Sum( PSNR( SQE(Y+U+V), Size(Y+U+V) ) ) / TotalFrames - Global PSNR: PSNR( Sum( SQE(Y+U+V) ), Size(Y+U+V)*TotalFrames ) Mean PSNR is used by the JM, and Average/Overall is used on Doom9 for example. git-svn-id: svn://svn.videolan.org/x264/trunk@19 df754926-b1dd-0310-bc7b-ec298dee348c commit 9168b245fb9b7e84e4c55faba839b89a4b54a48b [revision 18] Author: Laurent Aimar Date: Sat Aug 7 16:02:20 2004 +0000 * x264.h: increased X264_BUILD. git-svn-id: svn://svn.videolan.org/x264/trunk@18 df754926-b1dd-0310-bc7b-ec298dee348c commit 20c19c4b3a001c9e02775fdba040725a438db795 [revision 17] Author: Laurent Aimar Date: Fri Aug 6 18:06:09 2004 +0000 * all: Patch from Måns Rullgård "Here's a patch that adds some kind of rate control. I suppose it is by no means perfect, but it's much better than constant quantizer. It also has a very crude scene change detection that sometimes avoids a buffer underflow by reencoding oversized P/B frames as I frames." git-svn-id: svn://svn.videolan.org/x264/trunk@17 df754926-b1dd-0310-bc7b-ec298dee348c commit 30a244d00365a81999c82bdd3dd8beb0d036d36a [revision 16] Author: Eric Petit Date: Mon Aug 2 07:05:05 2004 +0000 Linux PPC AltiVec fix git-svn-id: svn://svn.videolan.org/x264/trunk@16 df754926-b1dd-0310-bc7b-ec298dee348c commit b0495a99ee0b65b03be7d6961ddb70ef7e38dcf0 [revision 15] Author: Eric Petit Date: Wed Jul 28 21:39:06 2004 +0000 BeOS fixes (no stdint.h, no libm) git-svn-id: svn://svn.videolan.org/x264/trunk@15 df754926-b1dd-0310-bc7b-ec298dee348c commit bf06e99e9b054a3f671a6f3f0c62d9b204057b0b [revision 14] Author: Eric Petit Date: Tue Jul 27 08:34:59 2004 +0000 Attempt to fix build on Linux PPC git-svn-id: svn://svn.videolan.org/x264/trunk@14 df754926-b1dd-0310-bc7b-ec298dee348c commit 86ca49033ad85660dc88f82ab721263f4d29290e [revision 13] Author: Laurent Aimar Date: Fri Jul 23 18:14:59 2004 +0000 * encoder.c, analyse.c, macroblock: fixed when using a qp per MB. (Buggy for pskip and mb with null cbp luma and chroma). * dct*: fixed order of idct. git-svn-id: svn://svn.videolan.org/x264/trunk@13 df754926-b1dd-0310-bc7b-ec298dee348c commit 55eb54c7e47a8f98f4382a953e03ee414972c36f [revision 12] Author: Laurent Aimar Date: Fri Jul 16 18:26:19 2004 +0000 * cpu.asm: mmh trashing ebp,esi and edi isn't a good idea I fear ;) git-svn-id: svn://svn.videolan.org/x264/trunk@12 df754926-b1dd-0310-bc7b-ec298dee348c commit a8703c8933f51715c25118eb83487072a548934e [revision 11] Author: Laurent Aimar Date: Tue Jun 29 22:41:42 2004 +0000 * all: fixed ss2 runtime selection. git-svn-id: svn://svn.videolan.org/x264/trunk@11 df754926-b1dd-0310-bc7b-ec298dee348c commit 02713c2401b471fb3bfa4541e8b867dfd06628cc [revision 10] Author: Min Chen Date: Fri Jun 18 02:00:40 2004 +0000 update & SSE2 support git-svn-id: svn://svn.videolan.org/x264/trunk@10 df754926-b1dd-0310-bc7b-ec298dee348c commit 77bce7d16d8a2fc12aff32997def2f966975617c [revision 9] Author: Min Chen Date: Thu Jun 17 09:01:19 2004 +0000 update git-svn-id: svn://svn.videolan.org/x264/trunk@9 df754926-b1dd-0310-bc7b-ec298dee348c commit c7631faf30ef78cc254a7ad8a7552e65824507d8 [revision 8] Author: Min Chen Date: Thu Jun 17 08:58:43 2004 +0000 remove some unused code git-svn-id: svn://svn.videolan.org/x264/trunk@8 df754926-b1dd-0310-bc7b-ec298dee348c commit 8d3d88be0fe3a22ea39861321eed015efb454359 [revision 7] Author: Min Chen Date: Mon Jun 14 05:47:51 2004 +0000 support for build checkasm.exe git-svn-id: svn://svn.videolan.org/x264/trunk@7 df754926-b1dd-0310-bc7b-ec298dee348c commit b2e2e34b3415cb9429475c1b828dcb2c36855308 [revision 6] Author: Laurent Aimar Date: Thu Jun 10 18:13:38 2004 +0000 * build fix (thx xxcd). git-svn-id: svn://svn.videolan.org/x264/trunk@6 df754926-b1dd-0310-bc7b-ec298dee348c commit 4fb5f9aa45f1a71fd718eaabd0992b8086887fa0 [revision 5] Author: VideoLAN Date: Thu Jun 10 07:32:18 2004 +0000 * TODO: test. git-svn-id: svn://svn.videolan.org/x264/trunk@5 df754926-b1dd-0310-bc7b-ec298dee348c commit a511bbecc348964c9de501a954c08f1b3bd4644d [revision 4] Author: Laurent Aimar Date: Wed Jun 9 19:35:31 2004 +0000 * vfw/* : oops... git-svn-id: svn://svn.videolan.org/x264/trunk@4 df754926-b1dd-0310-bc7b-ec298dee348c commit 166ed2dd0b3ef0a89f64371e78c70a1b8f874ddd [revision 3] Author: Laurent Aimar Date: Wed Jun 9 19:35:07 2004 +0000 * mc-c.c compilation fix for gcc >= 3.3 git-svn-id: svn://svn.videolan.org/x264/trunk@3 df754926-b1dd-0310-bc7b-ec298dee348c commit 602c87d5a3fb75cf7404bae08ecc3d5bc5ab1372 [revision 2] Author: Laurent Aimar Date: Thu Jun 3 19:29:57 2004 +0000 * all: re-import of CVS. git-svn-id: svn://svn.videolan.org/x264/trunk@2 df754926-b1dd-0310-bc7b-ec298dee348c commit 5dc0aae2f900064d1f58579929a2285ab289a436 [revision 1] Author: Laurent Aimar Date: Thu Jun 3 19:29:33 2004 +0000 * all: re-import of the CVS. git-svn-id: svn://svn.videolan.org/x264/trunk@1 df754926-b1dd-0310-bc7b-ec298dee348c