Fixing the AMD AHCI drivers for SB7xx on Windows 7

I heard a lot of urban legends about the Windows Update service messing up machines. Of course, I dismissed all of them with the classic “worksforme”, as it never happened to me. Until Microsoft delivered a 3rd party driver update via an optional package. You know, the kind of stuff that comes from the vendor and isn’t properly tested. I had the poor inspiration to check that one too instead of simply ignoring it, like I usually do with Bing Desktop and Silverlight. The next thing was a BSOD at boot.

I had to disable AHCI in the BIOS and revert to IDE mode for the SATA ports, which kinda sucks for several reasons. The most important ones: SSD performance is hurt under IDE mode, the TRIM command won’t work under IDE mode without 3rd party software since only the MSAHCI driver shipped with Windows 7 implements TRIM, and my HDD array doesn’t support NCQ under IDE mode.

When it comes to drivers, AMD is still a shitty company. What’s more, their engineers didn’t grasp the concept of backward compatibility. Uninstalling the driver that broke my installation and installing one that works proved to be a non-trivial task. Fortunately, I found this post on pchelpforum.com.

For the sake of avoiding link rot, I’m going to reproduce the essentials for posterity, with the same disclaimer as the original – you’re on your own if you mess up your machine and I’m not taking any responsibility if you follow these steps:

  • Delete any older version of the amd_ahci driver from C:\Windows\System32\DriverStore\FileRepository. The folders with older AMD AHCI drivers are named something like amd_sata.inf_amd64_neutral_c85cc6046149a413 (i386 on 32-bit, and most probably another hash). In order to remove the directory, you need to either elevate your explorer / shell to SYSTEM privileges, or take ownership of the driver directory, add the proper permissions, then delete it (a command-line sketch for this and the next step follows after the list).
  • From HKLM\SYSTEM\CurrentControlSet\services delete amd_sata and amd_xsata. There’s no need to remove the entries without the underscore (amdsata and amdxsata).
  • Reboot the computer. Don’t change from IDE to AHCI yet. The driver that actually worked for my combination, AMD 780G / SB700, is this one. Execute the installer, wait till it finishes copying the files to C:\ATI\Support, then cancel the setup when the Catalyst installer starts.
  • Open the Device Manager. Action » Add legacy hardware » Advanced mode » Show All Devices » Have Disk. Browse to the extraction path of the above package: C:\ATI\Support\11-12_vista32-64_ahci\Packages\Drivers\SBDrv\SB7xx\AHCI. There are a couple of directories: LH for 32-bit and LH64A for 64-bit. Select “AMD SATA Controller”, then continue. Unlike the author of the original material, I didn’t get an error about the device not starting.
  • Reboot the computer. Don’t change from IDE to AHCI yet. Go to Device Manager. Under IDE ATA/ATAPI controllers there should be at least one entry with a yellow exclamation mark, AMD SATA Controller. Uninstall “AMD SATA Controller” without checking “Delete the driver software for this device”. Reboot the machine.
  • Go to BIOS, enable AHCI. After boot, the OS installs the proper drivers, then prompts for another reboot. Reboot the machine. Done.
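
For reference, the first two steps boil down to something like this from an elevated Command Prompt. This is just my sketch, not part of the original instructions, and the folder hash is the example from above; yours will differ:

rem take ownership of the old driver folder, grant full control, then remove it
takeown /F C:\Windows\System32\DriverStore\FileRepository\amd_sata.inf_amd64_neutral_c85cc6046149a413 /R /D Y
icacls C:\Windows\System32\DriverStore\FileRepository\amd_sata.inf_amd64_neutral_c85cc6046149a413 /grant Administrators:F /T
rd /S /Q C:\Windows\System32\DriverStore\FileRepository\amd_sata.inf_amd64_neutral_c85cc6046149a413

rem remove the leftover service entries
reg delete HKLM\SYSTEM\CurrentControlSet\services\amd_sata /f
reg delete HKLM\SYSTEM\CurrentControlSet\services\amd_xsata /f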

In my case, it simply fixed the driver installation left behind by the failed Windows update, as the driver that now runs on my machine is from 2013, while the driver used in the above steps is from 2011. The drivers from the latest Catalyst, 13.4, failed to install either via the “Add legacy hardware” method or via a standard Catalyst setup.

[screenshot: amd-sata-controller]

Some benchmarks with an SSD drive under IDE mode:

[screenshots: benchmark1-ide, benchmark2-ide]

And some benchmarks under AHCI mode:

[screenshots: benchmark1-ahci, benchmark2-ahci]

I guess the sharp drop was due to TRIM doing its job. Yes, it’s enabled:

[screenshot: trim]
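
For the record, this is how TRIM support can be checked from an elevated Command Prompt (most likely what the screenshot above shows):

fsutil behavior query DisableDeleteNotify
rem DisableDeleteNotify = 0 means the TRIM commands are enabled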

Splitting a string every nth char in shell

I needed some reusable stuff that splits a string every nth char. Then I remembered that bash and zsh, the shells that I usually use, support string slicing. Kinda like Python does. Or the other way around. Made a shell function. Dropped it into .bashrc / .zshrc. Enjoy.

function string_split()
{
	# $1 - the string to split, $2 - chunk size in chars
	local str="$1"
	local count="$2"
	while [ -n "$str" ]
	do
		# print the first $count chars, then drop them from the string
		echo "${str:0:$count}"
		str="${str:$count}"
	done
}

Example:

string_split abcd 2
ab
cd
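
The heavy lifting is done by the ${string:offset:length} slicing syntax, which behaves the same way in bash and zsh:

str="abcdef"
echo "${str:0:2}" # ab - the first two chars
echo "${str:2}"   # cdef - everything from offset 2 onwards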

git is distributed, stupid

It’s no news that the popular code hosting services, like GitHub or Bitbucket, go down pretty often, and when you least expect it. Especially GitHub, or so it seems. From time to time I get yet another “GitHub is down” entry from Hacker News into my feed reader.

However, being hit by this problem, I managed to work around it by simply using stuff that’s already part of git itself. No need to go into panic mode for every GitHub hiccup. git is distributed, stupid.

I’ve seen a lot of solutions or proposals, but none of them were KISS compliant (or should I say: blog post title compliant), like using a different remote for pushing to a secondary service, or using hooks. It turns out that git supports multiple url entries per remote, but the functionality isn’t really exposed in the interface itself. You need to actually edit the config file.

Fortunately, git provides a config edit shortcut: “git config -e”, which opens the repository configuration file, .git/config, with the default editor. I find “git config -e” easier to remember than the path to the config file, but YMMV.

A real world example from one of my projects:

[remote "origin"]
        fetch = +refs/heads/*:refs/remotes/origin/*
        url = git@bitbucket.org:SaltwaterC/aws2js.git
        url = git@github.com:SaltwaterC/aws2js.git

This means that every time I issue a “git push [--tags] [remote branch]”, everything is automatically synced to multiple remote repositories, removing the single point of failure.
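
If editing the config by hand feels too raw, recent git versions can build the same layout from the command line via git remote set-url --add; assuming origin already points at Bitbucket, this adds GitHub as the second url:

git remote set-url --add origin git@github.com:SaltwaterC/aws2js.git
git remote -v # both URLs should now be listed for push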

The ordering of the url entries is important, as only the first one is used for pulling the changes. If a specific url fails to accept the changes, then the rest of the url entries are ignored. Sure, some things may go out of sync for a while, but “eventually consistent” is the term you’re looking for in this scenario. You may pull changes between team members, but that’s not always applicable, therefore it doesn’t hurt to have a failover option.

I found out that Bitbucket is a little bit more stable than GitHub, so my config defaults to it for pulling. It used to be the other way around though.

Use the cache, Luke, Part 2: don’t put all your eggs into the memcached buck … basket

This is the second part of a series called Use the cache, Luke. If you missed the first part, here it is: From memcached to Membase memcached buckets. Meanwhile, the AWS ElastiCache service proved to have better network latency than the Membase setup we rolled out ourselves, therefore the migration was easily done by simply switching the memcached config. No vendor lock-in.

However, it took me a while to write this second part.

[embedded video]

Please have a look at the above video. Besides the general common sense guidelines about how to scale your stuff, and the typical Postgres bits, there’s a general rule: cache, cache, and then cache some more.

However, too much caching in memcache (whatever the implementation) may kill the application at some point. The application may not be database dependent, but it becomes cache dependent. Anything that affects the cache may have the effect of a sledgehammer on your database. Of course, you can always scale that DB instance vertically, or scale horizontally by adding read-only replicas, but the not-so-fun part is that it costs a lot just to keep resources provisioned in order to survive a cache failure.

The second option is to have a short-lived failover cache on the application server. Something like five minutes, while the entries in the distributed memcache may last for hours. Enough to keep the database from being hit by live traffic, while you don’t have to provision a really large database instance. Of course, it won’t work with stuff that needs to be “real time”, but it works with data that doesn’t change with each request.

There are a lot of options for a failover cache since there’s no distributed setup to think about. It may be a local memcached daemon, something like PHP’s APC API, or, the fastest option: file based caching. Now you may think that I’m insane, but memcached still has the IPC penalty, especially over TCP, while if you’re a PHP user, APC doesn’t perform as well as expected.

I say file based caching, not disk based caching, as the kernel does a pretty good job at “eating your RAM” with its disk caching. It takes more effort to implement, since the cache management logic must live in the application itself and you don’t get stuff like LRU or expiration by default, but for failover purposes it is good enough to be worth it. In fact, the application ran for a few days on the failover cache alone without any measurable impact.
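
As an illustration only, here’s a minimal sketch of such a file based failover cache in PHP; the function names, the TTL and the serialization choice are mine, not something lifted from an actual library or from our code:

<?php
// Minimal sketch of a short lived, file based failover cache.
// The real thing needs error handling and some housekeeping for stale files.

function failover_cache_get($key, $ttl = 300) {
	$file = sys_get_temp_dir() . '/failover-' . md5($key) . '.cache';
	if (is_file($file) && (time() - filemtime($file)) < $ttl) {
		return unserialize(file_get_contents($file));
	}
	return false;
}

function failover_cache_set($key, $value) {
	$file = sys_get_temp_dir() . '/failover-' . md5($key) . '.cache';
	// write to a temporary file, then rename over the target - rename is
	// atomic on the same filesystem, so readers never see partial writes
	$tmp = $file . '.' . getmypid();
	file_put_contents($tmp, serialize($value));
	rename($tmp, $file);
}

// usage: hit memcached first, fall back to the local file cache
// $value = $memcache->get($key);
// if ($value === false) {
// 	$value = failover_cache_get($key);
// }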

The next part of not putting all your eggs into the same basket is: cache everywhere you can. For example, by using the nginx FastCGI cache, we could shave off 40% of our CPU load. There’s nothing experimental about this last part. It has been in production for the last 18 months. If you get it right, it can be a really valuable addition to a web stack. However, a lot of testing is required before pushing the changes to production. We hit a lot of weird bugs in edge cases. The rule of thumb is: if you get the cache key right, then most of the issues are gone before going live.
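
To give an idea of what’s involved, here’s a trimmed down sketch of the kind of nginx directives in play, not our actual config; the zone name, path, timings and the session cookie are made up:

# in the http block: where the cache lives and how big it can grow
fastcgi_cache_path /var/cache/nginx/fcgi levels=1:2 keys_zone=FCGICACHE:100m inactive=60m;

# in the server / location block that passes requests to the FastCGI backend
fastcgi_cache FCGICACHE;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
fastcgi_cache_valid 200 301 5m;
# skip the cache for logged in users, for example
fastcgi_cache_bypass $cookie_session;
fastcgi_no_cache $cookie_session;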

In fact, by adding the cache control headers from the application itself, we could push relatively short-lived pages to the CDN edges, shaving off a lot of latency for repeated requests, as there’s no round trip from the hosting data center to the CDN edge. Yes, it’s the latency, stupid. The dynamic acceleration that CDNs provide is nice. Leveraging the HTTP caching capabilities is nicer. Having the application in a data center closer to the client is desirable, but unless your target market is spread wider than what a bunch of machines in a single geo location can serve, it doesn’t make any sense to deploy into a new data center, which adds its fair share of complexity when scaling the data layer.

Reverse dependencies for the installed packages in Debian + friends

Some libraries are more libraries than others. It is one of those moments when you ask yourself if migrating to a newer version of a library fucks up the entire system. But you need that foo library as it implements feature bar. In my case, I wanted libpcre3 8.20+ in order to enable PCRE JIT. Tough luck. Not even Debian sid packages 8.20.

Now, I know that there’s apt-cache rdepends, but it lists all the reverse dependencies of a specific package. I needed just the reverse dependencies that are actually installed. With a little bash-fu, here it goes:

#!/bin/bash

# lists the reverse dependencies of a package, filtered down to the
# ones that are actually installed on the system
function package_rdepends
{
	for package in $(apt-cache rdepends "$1" | grep -Ev "^$1$" | grep -v 'Reverse Depends:')
	do
		# apt-cache policy prints "Installed: (none)" for packages
		# that are known but not installed
		apt-cache policy "$package" | grep 'Installed: (none)' > /dev/null 2>&1
		if [ $? -eq 1 ]
		then
			echo "$package"
		fi
	done
}

package_rdepends "$1" | sort -u

Saved as installed-rdepends. Made executable.

./installed-rdepends libpcre3
grep
libglib2.0-0
libglib2.0-dev
libpcre3-dev
libpcrecpp0

The above script may be slow for packages with many reverse dependencies, since it does an individual lookup for each package. I didn’t have the patience to measure how long it takes for libc6. Some benchmarks for the per-package lookup:

time apt-cache policy libpcre3 | grep 'Installed: (none)' > /dev/null 2>&1
 
real	0m0.006s
user	0m0.005s
sys	0m0.003s
 
time dpkg -L libpcre3 > /dev/null 2>&1
 
real	0m0.017s
user	0m0.012s
sys	0m0.005s
 
time dpkg -l libpcre3 > /dev/null 2>&1
 
real	0m0.667s
user	0m0.600s
sys	0m0.067s
 
time dpkg -s libpcre3 > /dev/null 2>&1
 
real	0m0.587s
user	0m0.533s
sys	0m0.054s
 
time cat /var/lib/dpkg/available | grep -E "Package: libpcre3$" > /dev/null 2>&1
 
real	0m0.034s
user	0m0.015s
sys	0m0.048s

However, I didn’t verify these results on a bare metal installation.