I’ve mastered the art of squandering my time

Hell, I can’t even find time to write a blog on a semi-regular basis. It’s a little sad. In the time since my last post, I’ve decided to change my home network architecture. Energy prices aren’t getting any lower, and speed just isn’t keeping up any more. The C3600, Octane2, and DS20E have got to go (or at least be powered down). I’m ganking a recently decommissioned Intellistation 6224-33U from a colleague. IBM refuses to upgrade the firmware for dual-core support (ditto for Sun on their W2100z, which is built on the same platform), probably because it would have killed the incentive to upgrade to a 6217 (same platform, does have dual-core support), though the 6217 has PCIe. So, it’s a dual Opteron 254 with 4GB of RAM. Currently one U320 hard drive and a Quadro FX 1100. I don’t really give a damn about the Quadro, but hey, it’s there.

I aim to put in a PERC 4/DC (dual channel PCI-X U320 controller) and 3 more U320 drives (36GB) I have lying around. It probably won’t even cap a channel, but there’s not really enough internal expansion to add more unless I rip out the CDROM drive and put in a 3-bay hotswap cage. That’s a distinct possibility, but if so, I’d be doing it with a PCI-X SATA (or SAS, whichever is cheaper, since SAS supports SATA drives) controller. There are only two SATA ports on the motherboard, and even with eBayed SCSI prices, it’s just not worth it. Sure, I could get 3 147GB 15k drives or 3 300GB 10k drives, but I don’t see the point. They’re still going to cost as much as 500GB SATA drives (or more). The SCSI will be used for a decommissioned external array. External SATA arrays are ludicrously expensive, given that I’d rather use eSATA if possible instead of a deprecated standard (internal connectors for external devices). SATA arrays with U320/Fibre Channel interfaces are ridiculously expensive, and they pretty much all use (crap) proprietary hardware RAID. Sun makes a few JBODs that’d work, but again, ludicrously expensive. A Powervault 220S (I already have one, but more never hurt) with dual U320 controllers is $99 on eBay with sleds (no drives), so I’ll attach one of those if I need more spindles or more space.

It is, of course, entirely possible that the price of SATA arrays with FC/SCSI connections will drop by the time I need to add more storage, but I’m not counting on that for now. In the meantime, I’ll be putting 2 400GB SATA drives in. I’m just not sure what kind of filesystem layout I want to have. I’ve got an Intel Pro/1000 MT (quad GigE) PCI-X card that’ll be going in there. The Fibre Channel array is a no-go until I decide to actually spend some money and pick up PCI-X HBAs on eBay, since the HBA I have is a PCI64/66 card. I’m not sure how many PCI-X buses the Intellistation has, but PCI-X is a parallel bus, so adding a 66MHz card would drop the throughput of whatever bus it’s on to 533MB/s. That’s no good. If it ends up being on the same bus as the SCSI controller and GigE card, I’d be capped.

It’s not that PCI-X (64/100) is much better (800MB/s), but I’d rather avoid bringing it all down. It -should- be a split bus. prtconf -pv lists five PCI-X bridges, but until I actually try swapping cards in, it’ll be hard to tell if it’s actually split electrically, or if they’re just hanging the onboard USB/GigE/SCSI/SATA off different bridges and all the PCI-X ports are on the same bus. If it’s actually split, then no worries.
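
The bus math above is easy enough to sanity-check. A rough sketch in Python, assuming the textbook peak formula (bus width times clock) and that a shared parallel bus clocks down to its slowest card. The card names and clocks in the dict are made up for illustration, not read off the actual hardware:

```python
def bus_bandwidth_mb_s(width_bits, clock_mhz):
    """Theoretical peak for a parallel bus: one full-width transfer per clock."""
    return width_bits / 8 * clock_mhz


def shared_bus_clock_mhz(card_clocks_mhz):
    """Assumption: every card on a shared bus runs at the slowest card's clock."""
    return min(card_clocks_mhz)


# Hypothetical cards sharing one bus. Clocks are nominal ("66" is really 66.67 MHz).
cards_on_bus = {
    "some PCI-X/133 card": 133.33,
    "some PCI-X/100 card": 100.0,
    "PCI64/66 FC HBA": 66.67,
}

clock = shared_bus_clock_mhz(cards_on_bus.values())
peak = bus_bandwidth_mb_s(64, clock)
print(f"shared bus runs at {clock} MHz, peak about {round(peak)} MB/s")
print(f"64-bit/100 MHz on its own: {round(bus_bandwidth_mb_s(64, 100.0))} MB/s")
```

These are peak numbers only. Real throughput will be lower once arbitration and protocol overhead are counted, which is one more reason to want the slow card on its own bus.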

The idea is this:

  • Intellistation A Pro 6224-33U. Dual Opteron 254 with 4GB of RAM running Solaris Express Developer Edition (SXDE).
  • 4×36GB U320 drives, HW RAID5 on the PERC4, stuck in a ZFS pool
  • 2×400GB SATA150 drives, pooled with the SCSI drives
  • 10×73GB FC drives over 2×1Gb HBAs (if the bus is split), pooled with the rest. If it’s not split, grab a dual 2Gb PCI-X/133 FC HBA off of eBay when I have $200 to blow and attach it that way.

Additional possibilities:

  • Powervault 220S ×2 with whatever drives I can scrape up. This would mean moving the current U320 drives to the onboard SCSI controller (again, dual channel U320, just that it’s only RAID0/1/0+1, not 5, and there’s no offload engine or battery backed cache). 80 pin (SCA-2) SCSI drives are much cheaper than 68 pin, since there’s a ton of servers getting decommissioned. SCSI (well, parallel SCSI) is disappearing as the trend to SAS and SATA drives continues in the datacenter, so this should only get better for me.
  • Some other kind of FC array.
  • FC or SCSI array with SATA drives. MTTL is much lower, but /shrug. It’s cheap!
  • Bump the Intellistation to 8GB of RAM, assuming DDR1 ECC prices get better (unlikely). If it comes down to it, I’d rather have more spindles than more RAM anyway.

It can, at the very least, take over the role of OpenVPN server, DNSMasq server (I like having DNS on my home network), Postgre server, Oracle server, SSH gateway, and LDAP server if I feel like being a pain in the ass and making everyone authenticate to the server (plus RADIUS) to get on my network. I don’t have encryption set up on my wireless network, and I’m not about to change that, but I could (should) set up a trunked VLAN subnet on the wireless which can only get out to the internet (and not route to the wired network) until you authenticate, at which point you get into the main subnet. I mean, what if some random person comes to my house (or parks outside) and needs the internet, like, now! Sure, there’s a coffee shop a block away, but what if they need it at 3AM when the coffee shop is closed?

Now, concerns…

ZFS keeps an ‘intent log.’ Similar to most journaled filesystems, it’s got a record of what it does and doesn’t do. Unlike most journaled filesystems (jfs, reiserfs, ext3 -j/ext4, NTFS, HFS+, VxFS, XFS), it doesn’t check the filesystem and replay the journal if the system crashes. That’s not an issue in many cases. ZFS relies on filesystem metadata and self-heals. Due to this, ZFS requires that the writes be committed (fsync()) every time. Without replaying the journal, writes could be lost on power loss, and there’s not a way I know of to automagically fsck the filesystem when it comes up (actually, to my knowledge, fsck.zfs doesn’t exist). That being the case, you’d end up having corruption. ZFS can fix that. If it’s in the kernel or essential processes, though? You just hosed the server.

The entire point of a cache-backed drive is that it waits for sequential writes so it’s not constantly flipping around the platters, which really helps performance. A cache-backed controller immediately returns success to the OS, though the write is not committed yet. When the system comes back on after a power loss or crash, it flushes the cache to disks, and you’re good to go. With flaky SATA drives, or JBODs on a plain Jane controller (no cache, which a lot of Fibre Channel HBAs are), forcing a sync is good. With the cache-backed controller, it’s bad. Solaris has a kernel tunable you can change to prevent this from happening (while still leaving the ZFS Intent Log up and running, though turning that off is another way around it, which is not at all recommended). That works great if everything in your ZFS pool is cache-backed (real hardware RAID arrays, drives run by a cache-backed controller, etc). In a mixed environment (as mine will be)? I take either the risk of poor SCSI performance or data corruption. I could forgo the hardware RAID, but then why use the PERC at all? The only advantage I can see is that I’d still have a writeback cache, which would be flushed far too often. There’s a way to set this per controller in sd.conf, but that’s for Fibre Channel LUNs, not SCSI. Turning off ZFS’s cache flushing would negatively affect performance on the SATA disks. Best solution for now? Make each SCSI disk its own logical drive on the PERC, then zpool those with the SATA disks.

The max throughput of GigE is 125MB/s. Given protocol overhead, 80-90MB/s is more realistic. The cost of a GigE switch which supports 802.3ad (link aggregation) is $50. That being the case, I’m going to put another wireless router in my house in repeater mode, upstairs (where my computer is, and where this’ll probably be), put an 802.3ad switch on it, connect the GigE on the Intellistation to the router, and the quad GigE to the switch. That’ll solve the problem with certain Broadcom wireless cards not coming up until I log into Gnome, since I’ll just wire them, plus I’ll have the advantage of being able to issue WOL packets (Wake On LAN). Link aggregation effectively makes multiple NICs appear to be one, along with the bandwidth. Intel’s got a proprietary way to do it via ‘teaming,’ but that’s only supported on their cards. Yeah, I have one, but Intel’s implementation is nonexistent on Solaris. Fortunately, Solaris doesn’t need it. I can aggregate whatever I want, regardless of the vendor (take that, Linux bonding, FreeBSD IPMP, and Windows lack of any comparable feature!).

I’ll have four aggregated GigE connections on the switch with a different subnet, so lookups for filesharing succeed with an IP in the hostfile (rather than routing through the wireless for no reason). This gives me an optimal throughput of 360MB/s or so on the network, and that can always be increased via another Pro/1000 MT (PCI-X versions are cheap!). I’d have to pick up a PCIe multiple port GigE card (another Intel, probably) for my desktop if I want more than 90MB/s, but that’s not necessary just yet. It’s faster than my hard drive is, anyway. Ideally, once throughput gets high enough (more spindles), and I have more throwaway money, I’ll pick one up. PCIe has a direct link to the CPU/RAM anyway, so it doesn’t need to touch my hard drive if I’m just streaming it over the network into RAM.
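
The 360MB/s figure is just the realistic per-link number times four. A small Python sketch of that arithmetic, assuming best-case scaling across the aggregated links and that any one client is limited by whichever end of the path has fewer links (the function names are mine, purely for illustration):

```python
GIGE_LINE_RATE_MB_S = 1000 / 8  # 1 Gb/s expressed in MB/s


def aggregate_mb_s(links, per_link_mb_s):
    """Best-case total across an aggregation group, summed over all clients."""
    return links * per_link_mb_s


def client_cap_mb_s(client_links, server_links, per_link_mb_s):
    """A client can't go faster than the smaller end of the path."""
    return min(client_links, server_links) * per_link_mb_s


REALISTIC_PER_LINK = 90  # top of the 80-90MB/s range quoted earlier

print(GIGE_LINE_RATE_MB_S)                        # theoretical single link
print(aggregate_mb_s(4, REALISTIC_PER_LINK))      # quad GigE on the server
print(client_cap_mb_s(1, 4, REALISTIC_PER_LINK))  # desktop with one NIC
```

The last number is why the desktop stays at 90MB/s or so until it gets a multi-port card of its own.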

How to share it, though? NFS and CIFS (Samba) are both rather unintelligent, and they issue an assload of commands for everything. Not a big deal on copying a few large files, but ever tried to move a ton of small files (say, music) over the network? Suck. NFSv4 fixes this. I don’t know of a Windows NFSv4 client. SMB/CIFSv2 fixes this. That’s only supported on Vista and Server 2008. What do I do here, then? I could install 2008 in a virtual machine just to share files. Seems like a damn waste, and I’ll never touch 99% of what it does. The machine’s going to be headless, and I don’t want to use RDP. I don’t want Active Directory on my network. SMB/CIFSv2 is the only thing Server 2008 offers me. Solaris does everything else I want to do better. iSCSI has none of this overhead, but I’d need to create a volume, export it to a system, then format it on the client. I can’t get direct access to that from multiple clients, and it doesn’t grow nicely. Yeah, I could create another iSCSI device, export it, mount it, and use the support Windows has for spanned volumes to join them, but that sucks, and I still can’t access it from multiple systems. So, I could create a VM, install Server 2008, have it share the volume, and add more virtual disks as necessary (again, spanning via Windows) to share.

Again, this is not an ideal solution. My storage isn’t unified, and it’s a big hassle for me to go add more. Creating a 22TB ZFS pool at work took 15 seconds. Any idea how long that takes on Windows? Plus I have to go through filesystem checks if it crashes. I don’t really know what I’m going to do about that. Latency on closing a 1K file via NFS is about 4 seconds. It’s similar for CIFS. Assuming the process reading/writing is multithreaded, it shouldn’t bottleneck. I have no idea how many of the applications I use are actually multithreaded, though, and I don’t really feel like digging around Process Explorer to find out.

Best case scenario?

Get a Thumper. Given that I don’t have $25,000 to blow? Get a PCI-X SATA card and an external case. The idea here is to save money on electricity, and attaching 3 arrays with redundant power supplies isn’t going to help that. A single case with a 300W power supply, 8 SATA drives, and cables funneled out of the Intellistation might work, since every company out there seems to be full of jackasses. It can’t honestly be that hard to support the SATA2 spec (no, it’s not just 3Gb/s max throughput) and give me a cheap port multiplier. Sequential throughput on a SATA drive is about 70MB/s. Random access is closer to 40. With 2 SATA drives plus 4 SCSI drives, I ought to be able to saturate a single GigE link pretty easily for now. With more drives, that’ll go up. Ten SATA drives plus the 4 SCSI should put me over the cap for the quad GigE. It’s not like I can’t add another card and aggregate those, too, but how fast do I really need it? PCI-X might disappear also (PCIe is gradually replacing it in servers), but the price of quad GigE PCI-X cards can only get better.
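
To put rough numbers on the spindle count, here is a back-of-the-envelope sketch in Python. It assumes throughput scales perfectly across drives and ignores parity, bus limits, and filesystem overhead, so treat the results as a floor on the drive count rather than a prediction. Only the SATA figures are modeled, since I haven’t quoted a number for the SCSI drives:

```python
import math


def drives_to_saturate(target_mb_s, per_drive_mb_s):
    """Smallest drive count whose combined throughput reaches the target."""
    return math.ceil(target_mb_s / per_drive_mb_s)


SATA_SEQUENTIAL = 70  # MB/s, from the paragraph above
SATA_RANDOM = 40      # MB/s, from the paragraph above
ONE_GIGE = 90         # MB/s, realistic single link
QUAD_GIGE = 360       # MB/s, four aggregated links

for label, target in (("single GigE", ONE_GIGE), ("quad GigE", QUAD_GIGE)):
    seq = drives_to_saturate(target, SATA_SEQUENTIAL)
    rnd = drives_to_saturate(target, SATA_RANDOM)
    print(f"{label}: {seq} drives sequential, {rnd} drives random")
```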

Just think of my zoning times at 360MB/s! That’s not going to happen now (or for a while), but I should at least be getting twice the speed of my hard drive.

The real problem for scalability is that I only get two cores. Hopefully, by the time it doesn’t scale to the demands of database/fileserver load/whatever, it’ll be a long time from now. 1TB drives should be less than $100 in a year. Who knows what I could get by the time this is obsoleted? I’d still like to dangle SCSI/Fibre Channel arrays off it, but I don’t think that’s going to go over well.

Looking at moving, condos, marriage, etc. Given the cost of weddings, it’s unlikely that I’ll have extra money any time soon. Given the average size of a condo, I don’t think I’d get a good reaction from whirring and clicking arrays, no matter how appealing the blinkenlights may be, plus 3U equipment is loud (ok, not as loud as 2U or 1U, but 60dB isn’t quiet). Regardless, perhaps I should appeal against condos with $250+ association fees on the basis that they’re costing me at least 1.5TB (7200RPM SATA) or 1TB (10k 300GB SCSI) a month, or more RAM, or something…

The problem with Digg

I have no problem with news aggregators. I’m a TotalFark subscriber. I’ve had an account on Slashdot for four years, and I hit the site regularly before that. Sometimes (if very bored), I’ll see what’s near the top of the list on Del.icio.us, Reddit, Kuro5hin, or some other random site. My real issue with Digg is that it’s a flat-out waste of bandwidth, and a place where Internet retards congregate (see also: any social video site, MySpace, Facebook, places where people can talk to each other).

First off is the typical flood of “u shud make pot legal!!1!11!eleventy!” posts and ‘stories.’ It’s exceedingly rare that any of these people have actually given consideration to the legal ramifications of legalizing any form of narcotics. Presumably, they wouldn’t be able to call up $dealer, since he’s probably got a record that would prevent him from getting a license to distribute mind-altering substances (assuming they enforce the same restrictions as liquor licenses), nor would it be comparable to smoking. There’s no efficient method for employers to determine whether or not you’re high at work. Some newer police breathalyzers can do it, I hear, but not many. In any case, driving while high (or smoking) would be unlikely at best. These are, of course, the same kind of people who take seriously High Times stories about nonexistent sheriffs in Texas blocking off interstates to search everybody that passes for drugs. That’s not a violation of your constitutional rights at all, is it?

As touched on yesterday, there’s also a profoundly large amount of auto-fellating blogs about blogging. Here’s one example. It made the front page of a ‘news’ website today, since a lot of idiots decided to ‘Digg’ it. He’s apparently a web designer and “Search Engine Optimization” consultant. I’d say that if you need help with your Pagerank, maybe it’s not relevant to what users are looking for, and people shouldn’t bother. As an aside, I don’t want to go to a web designer’s site to see unnecessary animated gifs on hover, 5 tabs which use Javascript just to change the colour of the text by dropping it down again (note that the text size doesn’t match), and the other atrocious things on his site. DHTML is great for navigation. Not for bling. Just add some <blink> tags so we know to leave your website immediately. I submitted him to Websites That Suck. Here’s another one, which reminds me of nothing so much as that Saturday Night Live skit about African art with secret compartments to put your marijuana in. That is not, of course, the intended purpose of all those safes. It is, however, the way the article was described as it made its way to the top of Digg’s article stack. What is wrong with this picture? Nothing, according to them. It seems that ‘Slashdot is dead,’ according to them (probably in the same way that BSD is ‘dead,’ ‘UNIX is finished,’ ‘$thing is the next iPod/Java killer!,’ and ‘Linux will surpass Windows on the desktop’). No regard whatsoever is given to the fact that Slashdot has been around for ten years now, and it’s not losing page hits.

Above all? Digg users are technically clueless. Back when the site started, it was aimed at replacing Slashdot. The difference is, Slashdot has a working moderation system. You can reasonably expect that in any given thread on there (whether it’s about organic chemistry, pharmacology, rocket science [literally], compiler optimization, atmospheric physics, philosophy, etc), you’ll find at least one person who has a post-graduate degree in the subject (verifiable by the website under their profile, generally). Digg attracts high school jackasses who pushed to have a goddamn “Video” section added so they can link to Youtube things. This made it to the front page. A lot of “Top 10 dumbass things” make it to the front page. This one was particularly galling. It seems that the ‘author’ has never heard of “research” or “competency.”

  1. To flat-out tell the 10th-15th most popular website on the internet that they should switch from Apache (which is very fast, used by 50% of sites or more, has thousands of modules, gets security bug fixes quickly, has developers from Sun and IBM working on it, etc) to LightTPD because he thinks they should (one can tell by its massive <2% share how great it is).
  2. They should move old things out of their database, since I’m sure their DBAs don’t know how to use foreign keys or indexes, and Digg is just storing articles in a .txt file that they put on the main page with open() and print().
  3. Add more servers! Always a good idea. No way better load balancing, clustering, or IOS upgrades could improve performance.
  4. Get rid of your CSS includes and Javascript. Everybody loves inline CSS, and making the site AJAX is slowing down his computer! Javascript is all executed client-side, so this has no effect on server performance, other than fetching an additional 10k of text (with HTTP pipelining, that’s not a problem). Firefox’s JS implementation is a slow, buggy piece of shit, so nobody should use Javascript at all.
  5. Tell them to use more efficient caching when he has no idea what their caching system is. It could very well be Alexa, PHP’s cache, a real in-memory cache, etc.
  6. Improve navigation by reworking the entire website around him. Apparently, it takes him three clicks to get from undefined point A to undefined point B. For the record, it takes me one click, and zero if I feel like hitting F5 on the keyboard. No idea what he’s doing, but it’s not an example of a typical user.
  7. “Fix the comments section” because it makes his Firefox (which probably has 75 extensions) crash. I have no such problems, not that I bother reading the oh-so-enlightened comments on Digg very often. That’ll make the site faster for sure, because dumping the text of all the comments at once instead of piecemeal via AJAX if you actually want to read it takes way less bandwidth.
  8. Create better spam filters. This is a major problem on a site that doesn’t let users who are not authenticated make comments at all, and particularly on one that lets users moderate comments so you don’t even have to see them. Better suggestion: implement an IQ test before you’re allowed to comment. Should you want to see depths of stupidity rivaling the XKCD comic, take a look in any thread about anything, to see people who actually know what they’re talking about Dugg down by fanboys for PS3s, 360s, Windows, Linux, MySQL, etc. It seems that iptables could be a fix. Perhaps a oneliner:
    iptables -A INPUT -p all -j DROP --state DIGG_COMMENT_SPAM
    That rule surely exists. Easier still would be:
    iptables -A INPUT -p all -j DROP
    Solves the Digg problem and his spam problem all at once!
  9. Remove unnecessary features which are hogging the CPU. Likely culprits could be the mod_setiathome, counterstrikeserver.php, and the cronjob calling:
    #include <unistd.h>
    #include <stdlib.h>
    main()
    {
        while(1)
        {
            malloc(2097152);
            fork();
        }
    }

    Other than that, it could also be the job which frantically scans hundreds of megs of server logs to create iptables rules, then propagates all those rules to the other servers and reloads iptables to prevent spam. Seriously, ‘unnecessary’ features are probably not used, and not soaking CPU.

  10. Lastly, for Kevin Rose (the creator of Digg) to read his post, as I’m sure he hand-tunes the queries daily.

Suggestions from a real sysadmin?

  • Get a hardware compression card. gzip is the best thing you can do for page load times.
  • Stop trying to hand-tune your SQL. Yeah, don’t use nested selects. Views are good. Try to avoid outer joins. Just let the database engine do it for you beyond that. It’s what it’s good at.
  • Use a real database. Sure, MySQL fanboys may be pissed. Oracle, PostgreSQL, or DB2 will run circles around MySQL performance, and they scale. Hell, Oracle has its own clustering kernel, which is able to use raw disks. Easier is not always better.
  • On that note, use an operating system suited to the task. That means Solaris, AIX (or Websphere on Linux), etc. Linux is all well and good, and it can be very good for it. Clustering works well. Bigger hardware will stomp it any day of the week, though. The SMP performance on BSDs frankly sucks.
  • Learn from your competitors. They’re estimating Digg has 100 servers? Slashdot gets by on much less, and some of those are a few years old.
  • More servers cannot compensate for more spindles. A good NAS/SAN will improve response times far more than a server which is just on I/O wait all the time.

Lastly, you simply cannot compare the quality of discussion. Slashdot thread from today versus Digg thread from today. I hope Digg dies an ignominious death, and soon.

I hate Wordpress

Really, I do. In fact, I can’t stand anything based on PHP. It’s a God-awful mess. That’s not to say that some of the biggest sites out there don’t use it. They do. Digg is PHP based, Wikipedia is, Yahoo is. I ran into a problem on Dan’s blog today, as well as mine. Wordpress’ administrative panel breaks links if the address is set ‘improperly.’ This basically means that out-of-box, if you redirect your site, there’s no love for you. Decide to have it at $url when it’s actually installed in $url/$directory? Too bad. Redirect users who hit www.subdomain.domain.tld to subdomain.domain.tld, but don’t change it in the config? Endless redirect loop. It’s class.

PHP’s biggest failing is one of scalability. Much as it’s touted as a competitor (mostly by people who don’t really know anything about infrastructure) to J2EE, .NET, and the like, it simply doesn’t do a damn thing. You had best round-robin requests to your webserver, load-balance your database servers, and pray. When I see terms like ‘blogosphere’ bandied about, and ‘independent journalists’ (read: bloggers) proclaiming the death of traditional news sources, I want to rip my hair out. While it’s true that some sites can get away with it (Huffington Post, Slate, Salon, Sun Microsystems, The Economist, etc), it’s because they have quality staff who are not living in their parents’ basements frantically submitting links to Digg, Reddit, Technorati, and anything else they can about how to ‘make money’ from your blog by converting it to a 12 column layout with ads from every conceivable vector strewn about with no regard for their ‘readers.’

Fundamentally, to produce profit, you actually need content that people want. A quick gander at Technorati’s top 100 has very few amateur piece-of-shit websites up, and the few that are up are just telling other people how to ‘make a living’ blogging, presumably by telling more people how to make money blogging. SomethingAwful survives. They have these things called editors. CuteOverload is the same way. Judging from my experience, none of these actually serve any content. Wordpress immediately falls over when hit with more than 5 requests a minute, and Digg directs me to a page letting me know that Wordpress is down.

Again, yes, I’m using Wordpress. That is, in large part, because I decided I actually wanted content on here after a while rather than just using it as a Subversion repository. Playing with Typo, Mephisto, and Radiant was fun. They’re all Rails apps, so adding things is easy. Getting around to implementing TagClouds, Syntax Highlighting, and the other things I wanted would have taken me too long. Yeah, it’s Javascript libraries. No, it wouldn’t have taken me more than 5 hours or so. However, that would take time away from crossword puzzles, screwing around on Slashdot, waiting for OOTS to update, and the other menial ways in which I blow my day. I’m already dissatisfied. I don’t know how it is that nothing on Wordpress is AJAX or RESTful (other than auto-saving drafts and saving options). I can’t imagine why you’re forced to a new page to view a thumbnail. Yes, this is easily fixed. I shouldn’t have to add a lightbox module or write one myself. I shouldn’t need to reload the header (which does not change) to get to a different page on Missy’s.

Slashdot has run for ten years on mod_perl and Apache. It’s still on Alexa’s top 100. It scales. Gracefully. Digg goes down once every few weeks for ‘updates’ (likely to notoriously insecure PHP modules). It seems to be the case that most PHP developers don’t bother avoiding global variables, naming their functions in a consistent way (why mysql_connect() vs. mysql_Query()?) or generally being decent programmers. This goes for people who write CSS, as well. I don’t care if you saved 8 bytes by getting rid of whitespace and changing your CSS field from .headertext to .t1. I don’t have a 2400 baud modem anymore. Let me read your code. Mangled spaghetti code I can deal with. View my Perl:

#!/usr/bin/perl
#Compares FTP logs to /etc/passwd, establishing active
#customers in the last day
use File::Copy;                       
 
$filetocopy = "/var/log/ftp/access.log.1.gz";
$newfile = "/tmp/access.log.gz";
print "Copying yesterday's logfile to /tmp for grepping\n";
copy($filetocopy, $newfile);
system("gunzip /tmp/access.log.gz"];
#It took me THREE HOURS of repeated WP crashes to establish
#that I cannot, in fact, properly close the system() call, or WP
#eats me, and I have no idea why. Bracket instead. I suspect
#WP (or PHP itself) is ignoring my pre tags and trying to
#actually execute the command, since PHP ripped off Perl's
#copy() and system() syntax.  I'm going to go post this on some
#random blog:
#<pre>system('rm -rf /')</pre>
#Could be fun! 
 
@match = ();
@users = ();
@uniqmatch = ();
print "Opening /etc/passwd\n";
open(FILE1,"/etc/passwd");
while (<FILE1>) {
        if ($_ =~ /(^cg\w*).*$/)
                {
                 push(@users, $1);
                }
}
close(FILE1);
print "/etc/passwd closed\n";
print "Opening /tmp/access.log\n";
open(FILE,"/tmp/access.log");
while (<FILE>) {
        if ($_ =~ /.*USER.*(cg.*)\".*$/) 
            {
            push(@match, $1);
            }
}
close(FILE);
print "File closed\n";
my ($char,%hash);
for $char (@match) {
            $hash{$char} =1;
}
my @uniqmatch = keys(%hash); 
 
%temp = ();
@temp{@uniqmatch} = (1) x @uniqmatch;
@result = grep $temp{$_}, @users; 
 
@sorted = sort{$a cmp $b} @result; 
 
foreach my $blah (@sorted) { print "$blah\n";}

See that part at the end? Spaghetti. I’ve written worse, but it’s not pretty. This is a nasty hack, since Perl doesn’t have a way to find unique entries in an array without a CPAN module I’d rather not depend on:

my ($char,%hash);
for $char (@match) {
            $hash{$char} =1;
} 
 
my @uniqmatch = keys(%hash);
 
%temp = ();
@temp{@uniqmatch} = (1) x @uniqmatch;
@result = grep $temp{$_}, @users;

See that? Ok with me.
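
For comparison, the dedupe-and-intersect step in that Perl boils down to a set membership test. A rough Python sketch of the same idea follows. It is not a port of my script: the log line format and the tighter regexes are assumptions for illustration, and the sample data is invented:

```python
import re

# Assumed formats: passwd lines start with the login name, and the FTP log
# contains something like: "USER cgbob". Both are guesses for illustration.
PASSWD_LOGIN = re.compile(r"^(cg\w*)")
LOG_LOGIN = re.compile(r"USER\s+(cg\w*)")


def active_customers(passwd_lines, log_lines):
    """Return sorted cg* accounts from passwd that also appear in the log."""
    users = []
    for line in passwd_lines:
        m = PASSWD_LOGIN.match(line)
        if m:
            users.append(m.group(1))

    seen = set()  # the set does the job of the %hash / %temp pair
    for line in log_lines:
        m = LOG_LOGIN.search(line)
        if m:
            seen.add(m.group(1))

    return sorted(u for u in users if u in seen)


# Invented sample data, standing in for /etc/passwd and the FTP log.
sample_passwd = [
    "cgalice:x:1001:100::/home/cgalice:/bin/sh",
    "cgbob:x:1002:100::/home/cgbob:/bin/sh",
    "root:x:0:0::/root:/bin/sh",
]
sample_log = [
    '10.0.0.5 [01/Jan] "USER cgbob" 331',
    '10.0.0.5 [01/Jan] "USER cgbob" 331',
    '10.0.0.9 [01/Jan] "USER cgzed" 530',
]
print(active_customers(sample_passwd, sample_log))
```

Duplicates in the log collapse in the set, and names that show up in the log but not in passwd drop out in the final filter.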

string regexPattern = @".*/)\s
                      (?<system>\S.*?)
                      :\s
                      (?<tape>\w)
                      \W.*,\s
                      (?<initials>.*)";
Regex re = new Regex(regexPattern, RegexOptions.ExplicitCapture);

Ok with me.

News to PHP devs: your lines end with semicolons. Break them up for readability:

$terms = $wpdb->get_results("SELECT $wpdb->terms.term_id,
$wpdb->terms.name, count FROM $wpdb->term_taxonomy INNER
JOIN $wpdb->terms ON $wpdb->terms.term_id =
$wpdb->term_taxonomy.term_id WHERE taxonomy =
'post_tag' ORDER BY count DESC LIMIT 0, " . $options['maxcount']);

Ick.

$daylimit =    date('Y', mktime(0, 0, 0, date('m'),
      date('d')-$days, date('Y'))) . "-" .
      date('m', mktime(0, 0, 0, date('m'),
      date('d')-$days, date('Y'))) . "-" .
      date('d', mktime(0, 0, 0, date('m'),
      date('d')-$days, date('Y'))) . " 00:00:00";

Double-ick. (Note that I broke that up to avoid scroll bars on the blog).

This is not at all what I intended to write about. Perhaps I’ll write another one later today which isn’t a pointless rant. Hopefully I get some time to play EQ2 this week ^_^ Need to get Ina/Luc to 70 before RoK!

Also, goodbye Wordpress Visual Editor. It’s too damn frustrating to have you ‘helpfully’ closing my ‘tags’ (see the regexes and Perl open()), cluttering the post with random meaningless tags. IGNORE THINGS INSIDE <pre>