Created Saturday, March 27, 2010 by Ryan Stenhouse
That’s the conference over for another year, and me safely back in the hotel. Sorry I had to cut this short – lack of a working internet connection for half of the day. I’ll be doing a more formal writeup of this year’s conference very soon.
Code demo now.
Example: A content distribution system. Explained with a diagram, needs the slides and video to describe properly. Boils down to a configuration server, which sends configurations to servers with various other roles.
Git is a distributed directory snapshot record keeper. You can implement something like time-machine for doing backups. That’s bloody clever.
What about size? Git does deltas internally, either automatically or manually with git gc.
If you have the same content in two different files, a delta-based system stores deltas for each; git stores the content once and points to it twice. File renames are also grim in file delta storage systems. In git it’s just another tree entry with the same content under another name.
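To make the "stored once, pointed to twice" idea concrete, here is a minimal sketch in Ruby. ToyStore, write_blob and write_tree are made-up names, and the key is a plain SHA-1 of the content rather than git's real object format, so treat it as an illustration of the idea and not of git's internals.

```ruby
require "digest"

# Toy content-addressed store (hypothetical, not git's real format):
# blobs are keyed by a hash of their content, and a "tree" is just
# a map of filename => blob key.
class ToyStore
  attr_reader :blobs

  def initialize
    @blobs = {}
  end

  def write_blob(content)
    key = Digest::SHA1.hexdigest(content)
    @blobs[key] ||= content # identical content is only stored once
    key
  end

  def write_tree(files)
    files.transform_values { |content| write_blob(content) }
  end
end

store = ToyStore.new
tree_v1 = store.write_tree("README" => "hello\n", "COPY" => "hello\n")
# A rename is the same blob key filed under a different name.
tree_v2 = store.write_tree("README.txt" => "hello\n", "COPY" => "hello\n")

puts store.blobs.size                           # 1
puts tree_v1["README"] == tree_v2["README.txt"] # true
```

Two files and a rename later, the store still holds a single blob; only the trees differ.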
Git doesn’t use a file-based delta storage system. In delta systems you have one file with all the deltas; in git you only ever have every unique combination of content, so there’s no unnecessary storage.
Git is a very generic system. It stores blobs, trees and commits. That’s it. You don’t need to use it just for version control; you can also use it for storing and distributing content.
git log --graph is bloody cool.
Now edit the README file and update the index: update-index and write-tree to save it. Commit it with the first parent rather than the second one, and it creates a branch within git.
You can also put a git SHA in any file, you can do git log $(cat file), and it’ll give you the same output back.
History of directories snapshot system: Shows how the manifests progressed from one to the other. To link one to the other, you use git commit-tree, which takes a commit message from STDIN and the SHA of a tree. To save the second version you do the same thing, but with -p to relate it to the SHA of the parent commit object.
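The parent-pointer idea can be sketched in a few lines of Ruby. Commit, History, commit_tree and log below are hypothetical names for a toy model, and the key is just a SHA-1 over the fields rather than git's real commit format.

```ruby
require "digest"

# Hypothetical commit record: a tree key, a message, and an optional parent key.
Commit = Struct.new(:tree, :message, :parent)

class History
  def initialize
    @commits = {}
  end

  # Roughly the role of `git commit-tree <tree> [-p <parent>]`:
  # record a snapshot and, optionally, which commit it follows.
  def commit_tree(tree, message, parent: nil)
    key = Digest::SHA1.hexdigest([tree, parent, message].join("\0"))
    @commits[key] = Commit.new(tree, message, parent)
    key
  end

  # Follow parent pointers back to the first commit, newest first.
  def log(key)
    messages = []
    while key
      commit = @commits.fetch(key)
      messages << commit.message
      key = commit.parent
    end
    messages
  end
end

history = History.new
first = history.commit_tree("tree-1", "first snapshot")
second = history.commit_tree("tree-2", "second snapshot", parent: first)
p history.log(second) # ["second snapshot", "first snapshot"]
```

History is nothing more than each snapshot remembering the key of the one before it.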
Git not only hardlinks files of identical content, it will also hardlink subdirectories too, so it’s not wasting storage space by storing the same things many times.
Now, to change README.text, and then git update-index the file. Takes what’s in the working-directory, snapshots it for you and updates the index for that item.
git read-tree does the opposite: it reads the tree out of the database and overwrites what’s in your staging index within git – it still doesn’t put files on your disk though. checkout-index takes whatever’s in your index and writes these files to your disk.
git write-tree writes that manifest into your database and gives you another SHA for that manifest. And then, cat-file gives you back a line representing all of the content.
Directory Snapshot storage system: git update-index gives you a staging area which lets you build up a manifest of files for it to store in the database. You can assign a ‘filename’ to content that’s been manually added before. git ls-files will list the file index, which is mapped to the hash store you’ve built before.
To get it back out, git cat-file blob
It also takes a file name, but ignores it: it takes the contents and gives you back the hash block pointer. It doesn’t store the file, it stores its content.
key/value store: hash-object is a push which puts content into the repo, and cat-file pulls the content back out, like pull and commit. Using hash-object you can put any content into git. Without -w it will just give you a hash.
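Here is a rough Ruby sketch of what that key is. Git hashes a small header ("blob", the byte size, a NUL byte) followed by the raw content, which is why the file name never matters. ObjectStore, put and get are made-up names for an in-memory stand-in; real git also compresses objects and writes them under .git/objects.

```ruby
require "digest"

# The key git gives a blob: SHA-1 of "blob <size>\0" plus the content.
def git_blob_sha(content)
  header = "blob #{content.bytesize}\0"
  Digest::SHA1.hexdigest(header + content)
end

# Hypothetical in-memory stand-in for the object database.
class ObjectStore
  def initialize
    @objects = {}
  end

  # Roughly `git hash-object -w --stdin`: store the content, return its key.
  def put(content)
    key = git_blob_sha(content)
    @objects[key] = content
    key
  end

  # Roughly `git cat-file blob <sha>`: get the content back by key.
  def get(key)
    @objects.fetch(key)
  end
end

store = ObjectStore.new
key = store.put("test content\n")
puts key            # d670460b4b4aece5915caf5c68d12f560a9fe3e4
puts store.get(key) # test content
```

Calling git_blob_sha on its own is the "without -w" case: you get the hash back but nothing is stored.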
Git is a simple key/value content storage system. It’s a directory snapshot storage system, and it keeps a history of these directory snapshots.
What is git really? How does it work? What can you do with it? Trying to break down how you think about git and teach you how it fundamentally works.
Get his book for free at progit.org
Going to be talking about doing ‘weird’ stuff with Git.
He’s up on the podium and ready to go. But there are still some stragglers arriving.
Next talk time, I’m at ‘Gittin Down to the Plumbing’ by Scott Chacon, who is one of the folks from GitHub.
Lunch!
Q: Is the Globalize plugin intended to be used instead of Rails’s i18n? It works with it and on top of it by extending how it works.
Q: How would you mix this with restful concepts? Is each translation a new resource, or is each document one? It depends on the languages involved, your audience and how you decide to write it. With the scope/prefix then you can say that /en/document/page is a separate resource from /de/document/page or you can serve it up as a different format, test.en.html. So it depends.
Q: Any tips on managing the translation? There’s a new gem / plugin called translate which is a CRUD web framework for translating keys, which works well to manage this. Globalize 2 might also help too. Translate only works well for the YAML based stuff.
Q: Are there any good plugins for getting your translations into javascript, and can you recommend any good translation services? A: Jibberish was state of the art the last time I looked, but I don’t know what’s out now. Globalize 2 is in early beta and might help, but I don’t know. The second question varies too much between the languages you’re wanting to translate to.
You also need to take into account the cultural differences between people and countries, even though they officially speak the same language. Pay careful attention to colour too.
DO: Translate static pages like error pages.
If you’re storing User data, save language preference too. Avoid machine translation at all costs.
Using subdomains for locales is a great way to do it if you can put up with the mess and hassle; the recommended way to do it is in a URI.
How to determine locale? Browser settings can tell you, but they can be changed by the user and are not reliable. Geolocation is no use, since it only tells you where the IP address is from. Ask the user which language you should use, and present it in their language.
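One way to read that advice: the user's explicit choice wins, the browser's Accept-Language header is only a hint, and there's a default at the bottom. A minimal Ruby sketch follows. SUPPORTED_LOCALES, parse_accept_language and pick_locale are hypothetical names, and the header parsing is deliberately crude (it only looks at the first two letters of each tag and expects ";q=" with no spaces).

```ruby
# Hypothetical list of locales the application has translations for.
SUPPORTED_LOCALES = %w[en de fr].freeze

# Turn "de-DE,de;q=0.9,en;q=0.8" into language codes, best match first.
def parse_accept_language(header)
  pairs = header.to_s.split(",").map do |part|
    tag, quality = part.strip.split(";q=")
    [tag.to_s.downcase[0, 2], quality ? quality.to_f : 1.0]
  end
  # Sort by quality, keeping the original order for ties.
  pairs.each_with_index
       .sort_by { |(_tag, quality), index| [-quality, index] }
       .map { |(tag, _quality), _index| tag }
end

# Explicit user choice first, then the browser hint, then the default.
def pick_locale(user_choice, header, default = "en")
  candidates = [user_choice, *parse_accept_language(header), default]
  candidates.find { |locale| SUPPORTED_LOCALES.include?(locale) }
end

puts pick_locale(nil, "de-DE,de;q=0.9,en;q=0.8")  # de
puts pick_locale("fr", "de-DE,de;q=0.9,en;q=0.8") # fr
puts pick_locale(nil, "ja")                       # en
```

Once the user has picked a language, saving it with their account means the header never has to be consulted again.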
Use UTF8 always in your database. Also send UTF8 to the browser and e-mail.
Watch out for gems that don’t use i18n, and using markup might not work too well. Keys may overwrite each other in the YAML, you must respect the user’s language preference, and some languages have different translations depending on the context and the usage.
Take advantage of MySQL collation, it does it properly and it’s very clever at it.
It’s up to you to handle all text as translatable strings, all the date and time formats, all the address and postcode formats, currency and number formatting, and searching and sorting, ø vs o.
You can include variables and things too within the I18n stuff, so you can do string interpolation in the correct language, and it has a basic pluralisation implementation.
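To show what interpolation plus basic pluralisation amounts to, here is a toy lookup in plain Ruby. It is not the I18n gem: TRANSLATIONS and translate are made-up names, and the one/other split is only a simple English-style rule that many languages don't follow.

```ruby
# Hypothetical translation table; in Rails this would live in YAML files.
TRANSLATIONS = {
  "en" => {
    "greeting" => "Hello, %{name}!",
    "inbox" => { "one" => "%{count} message", "other" => "%{count} messages" }
  },
  "de" => {
    "greeting" => "Hallo, %{name}!",
    "inbox" => { "one" => "%{count} Nachricht", "other" => "%{count} Nachrichten" }
  }
}.freeze

# Look up a key for a locale, pick a plural form if there is one,
# then interpolate the variables into the string.
def translate(locale, key, vars = {})
  entry = TRANSLATIONS.fetch(locale).fetch(key)
  entry = entry.fetch(vars[:count] == 1 ? "one" : "other") if entry.is_a?(Hash)
  format(entry, vars)
end

puts translate("en", "greeting", name: "Ryan") # Hello, Ryan!
puts translate("de", "inbox", count: 1)        # 1 Nachricht
puts translate("de", "inbox", count: 3)        # 3 Nachrichten
```

The point is that the whole sentence is the translatable unit, so word order and plural forms stay with the translator rather than being glued together in code.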
The backend in Rails 2 is a simple YAML one for looking up keys and swapping strings. But if you’re working with a lot of content, it gets costly. Luckily, you can change the backend with a plugin.
This guy’s slides are only 50% readable.
Before Rails 2, there were tonnes of plugins made available to solve the problem, but it was all monkey-patching and they were always fighting with changes in Rails to keep them up to date. In Rails 2.2 – 2.3 there’s a gem now which works with Rails to do the i18n stuff.
Briefly, i18n in Rails. In Rails 1 – supports any language you want as long as it’s English.
Q: If your source is utf8 and you try to cast it, is ruby smart enough not to do anything? I don’t know, good question. Might be worth trying to benchmark.
Until 1.9.2 ships, end all your regular expressions with /u to make sure that Ruby treats them as UTF8.
In 1.9, you can set an encoding for pretty much everything, most useful for files. It can also do translation between encodings on the fly. So you can make sure that you always have UTF8. RegExps in 1.9 are also strictly encoded. (This is really bloody cool)
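A small Ruby sketch of the difference between labelling bytes with an encoding and actually transcoding them. The sample bytes are my own example (the word "café" in ISO-8859-1), not something from the talk.

```ruby
# "café" as ISO-8859-1 bytes: the é is the single byte 0xE9.
latin1 = "caf\xE9".dup.force_encoding("ISO-8859-1") # relabel only, bytes unchanged
puts latin1.bytesize        # 4

# encode transcodes: the é becomes the two-byte UTF-8 sequence.
utf8 = latin1.encode("UTF-8")
puts utf8.encoding          # UTF-8
puts utf8.bytesize          # 5
puts utf8.valid_encoding?   # true
puts utf8 == "caf\u00E9"    # true
```

For files, opening with a mode such as "r:ISO-8859-1:UTF-8" should apply the same conversion while reading, which is the on-the-fly part.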
Ruby 1.9 and Rails 2: Rails 2 adds multibyte support through ActiveSupport, but it adds extra method calls to what you’re doing; an implementation of it is in Ruby 1.9, which directly supports Unicode now. In Ruby 1.9 code, you can actually use UTF8 in your source and actually use lambda (the Greek letter) to create a lambda.
There are two levels of Unicode, there’s the actual code-point system which is about giving an address for every single character, everywhere. It’s also backwards compatible with ASCII because of how the Byte Order Mark works.
Various people in other countries filled the not-used parts of ASCII with characters that only they needed. To avoid colliding character sets, Unicode turned up to save the day.
Ruby < 1.9 doesn’t support non-traditional ASCII strings very well. Digressing into character sets, showing an old bitmap of the first ASCII table. But it has a huge ‘western’ (He means US) bias.
Three steps – Support UTF8, internationalise your application, and localise your application.
Almost ready to go, it seems.
I’m really interested in hearing how Joshua approaches doing this in his talk, as a polyglot I have a bit of a vested interest in developing multi-lingual / multi-national applications.
Setting up now, the Rumble in the Jungle guy overran. Really wanting to hear Notitia linguarum – Multinational Ruby and Rails
Waiting for this guy to finish so we can do the Multilingual talk!
Speed Runs are not sustainable and do not make you better as a programmer.
And the straight man is won over to the heretical way of doing things wrong!
Write Bad Code: Sometimes.
Freeze the speedrun with tests, and then refactor it. Treat it as bad until you do it properly.
Keep yourself up-to-date, keep learning because in a speedrun, you can no longer do any learning. Once your speedrun is done, bin it. It’s disposable. Kill it with fire. You’re going to re-write it at production quality, or you can re-factor to pay off the technical debt.
Deploy! Make Heroku dirty. Use it as your remote git repository, you can’t push without deploying. Don’t need a filesystem.
AJAX in 10 seconds; a bit dodgy. So he fakes it by doing window.location.reload();
Login in 60 seconds now, use the simplest possible thing to work – if you just need a password. Change it when you deploy.
But don’t go too far!
Social in 60 seconds – Basically Newsfeeds. Beta feed, so everyone is already friends. No public / private issues. Excellent 15 lines of code to do it on screen.
It’s still KISS, and it’s still YAGNI. gem: --no-rdoc --no-ri in your .gemrc to speed you up. It’s important to use Real Data though, especially if you’re going to show off your application to a stakeholder. Only takes time to do it properly.
Denormalize – If you’re using an RDBMS you don’t need to use it properly! You’re prototyping, not writing production code.
Use what you know, Don’t need migrations where we’re going. auto_migrations lets you hack your database schema.rb file directly. But better still, Ruby – not SQL.
MySQL, not SQLite.
github.com/newbamboo/bamboo_scaffold
Rule 5: Cheat. Sacrifice Performance. Generators aren’t useful, reading code is slower than writing it.
Tests? Don’t need them if there’s risk of you being hit by a train. No automatic tests on a speedrun, runthrough only. Straight guy explaining tests are good – back to the Production vs Prototyping thing. This talk is very well done.
The big deal is that this isn’t for production code, they keep underlining it. This is for quickly seeing how / when / if something will work. Speed Runs are for teams of one. Like a motorcycle.
Do one thing at a time. Have the discipline to postpone stuff if it doesn’t need to be done now.
Having an Observer would be good, or another person doing a speed run might help you get your timeboxing right.
Pair Programming – Having a Wingman slows you down on a speedrun, it’s good for writing elegant code; but not for speed-runs. We’re sacrificing being better programmers on this limited spike for speed.
Never look back – we’re playing chicken with a train, need to look forwards!
Don’t delay the feedback too much.
Take pride in the speed of your development, not the elegance of your code when you’re going for speed. Keep shipping and keep the project moving.
Rule 3: Maintain Momentum – Keep the excitement!
Timeboxing should be a punch to the face! Stick to it, if it’s taking too long bin it.
Go with the first thing that can work, not necessarily the best thing.
Code with a nice bottle of wine, it’ll suppress the urge to get things working perfectly. Experiment.
Learn how to USE a plugin, not how it works, on a Speed run. You’re sacrificing knowledge for speed.
How far away is the train?
Skip code, not planning
Start with a question (or a vision)
Don’t write any code that isn’t to answer the question, get an answer before you’re hit by the train
How done is Done Done?
Work out the end user, and what they need to do.
Rule 2: Do what you know
Go Offline – Reduce the options to reduce the time taken to research. Go offline and don’t look at Google, only rely on your own knowledge
Frameworks? Not fast in the short term.
You’re either experimenting with the problem domain or the technical domain
When you’re doing a speed run, you don’t get to learn new things. In the long term, frameworks make you better. In the short term for a sprint, you get quicker results.
Lighter is not faster.
But not if you don’t know the light framework
Plugins may be an exception.
back on the blog!
Battery running out.
Also, to survive. In a startup, you’re sitting on a ticking bomb, same with personal projects, time is your ticking bomb. Knowing you’ll have technical debt, you can get short term gains for long term pain. Graph with the ‘Death Box’.
Also, to prove if something is technically possible. Not on a business level, but purely on a technical level.
Also a good way to check to see if something really is worth eventually doing. Building it well comes after building something that works as a prototype.
To talk, sometimes you need a thing to show to people to explain how something will work. Hacked something together with smoke and mirrors to show how something will work and engage your customer.
Why write bad code? We shouldn’t always write ‘good’ code, sometimes we just need the speed. Sometimes code quality isn’t the most important thing.
Bad code gives you bad technical debt. Nice graph. You get bugs in ‘non-poetic code’
Canned heckling from other Bamboo Staff. Bartosz Blimke has turned up, he’s the straight man in this double act.
There are times when you need to do things quickly. Gwyn Morfey writes bad code in London and he’s here to tell us how.
Overly dramatic countdown timer is overly-dramatic, but also a little cool. I guess that means he’s starting in 10 seconds…
Guess it wasn’t about to start… More people are arriving now. And the disembodied voice of Paul has appeared.
Talk about to start.
Next talk – ‘Write Bad Code’
Talk over!
Q: You had a method on the document, ‘of_user’. Why didn’t you use an association? That was for the example, it was just for the slides and they generate identical queries so they’re equivalent.
Q: When Rails doesn’t do something expected, it’s a bug. Fix it? I agree. Q: The use index stuff – are you using hard queries or have you done anything to AR to make it nicer? We’re using the SQL, since there’s a culture in the company against using monkey patches.
Q: On the first slide, you showed that the declarative code didn’t do what you thought it should do, and you shouldn’t be testing rails – did this change your view on testing rails? Yeah, it made me want to test more stuff – things that changed the way I tested – even things that I should not be testing.
Q: Are you using a stock MySQL? No, we use Percona MySQL.
Bloody interesting talk this! He’s recapping now. Talk is online at a unicode URI I can’t type.
The problem with indexes: They’re a pain for big / busy websites. Paranoid delete adds a timestamp. MySQL only uses one index per where clause. And an index on two columns rather than one is a possible solution, but you can’t just add indexes willy-nilly. JIT Solution: Add index to a slave, or you can use USE INDEX or FORCE INDEX on your query to make MySQL use a specific index.
It recursively memoizes and freezes everything you’ve got if the model has associations – fills out all your memory.
It’s all metacode, and it Freezes values.
It does do some more clever caching and separates caching from main logic.
Solution – Use the real AR methods. Now the problem with memoize. Executes a query every time it’s called, most people use an instance variable to get around it. But Rails 2.2 lets you do memoize.
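The instance variable workaround most people use looks roughly like this in plain Ruby. Report and run_query are made-up stand-ins for a model and its expensive query; this is not the Rails memoize helper itself.

```ruby
class Report
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # Stand-in for an expensive database query (hypothetical).
  def run_query
    @query_count += 1
    [1, 2, 3]
  end

  # Runs the query every single time it is called.
  def slow_total
    run_query.sum
  end

  # Runs the query once and caches the result in an instance variable.
  def total
    @total ||= run_query.sum
  end
end

report = Report.new
3.times { report.total }
puts report.query_count # 1
3.times { report.slow_total }
puts report.query_count # 4
```

One catch with `||=`: if the cached value is nil or false, the query runs again on every call.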
association.delete_all does everything BUT delete things. There’s a nice table on the slide. It does make sense, but it keeps selecting before updates and does select before deletes.
You need to move before_destroy before your associations or it won’t work. Bad AR!
Category model to group documents and memberships together, with a dependent delete_all.
The problem with delete and destroy – Rails is poorly documented here for the AR association proxy.
GitHub Gist 105367 explains it, and only works in MySQL.
Need to use the BINARY keyword to make the query do a case-sensitive comparison.
In Oracle or Postgres, use a functional index. In MySQL, SELECT is implicitly case-insensitive.
The problem with validates_uniqueness_of. Scribd needs it to be case-insensitive. WHERE LOWER(login) query generated, over the entire database, it took 2 minutes to run every time a user tried to sign up.
Gist 105218 on github for code!
find_in_batches only works if you have table.primary_key.
Problem with find_in_batches – Instantiating every record in the database at once is a silly thing to do! find_each is handy.
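The idea behind batching, sketched in plain Ruby without ActiveRecord: fetch a slice of rows after the last primary key you saw, process it, repeat. ROWS, fetch_after and each_batch are hypothetical names, and the array stands in for a table whose ids are positive integers.

```ruby
# Hypothetical table: rows with an integer primary key.
ROWS = (1..10).map { |id| { id: id, title: "doc #{id}" } }.freeze

# Stand-in for "SELECT ... WHERE id > ? ORDER BY id LIMIT ?".
def fetch_after(last_id, limit)
  ROWS.select { |row| row[:id] > last_id }.first(limit)
end

# Yield rows in batches so only batch_size rows are in hand at once.
def each_batch(batch_size)
  last_id = 0
  loop do
    batch = fetch_after(last_id, batch_size)
    break if batch.empty?
    yield batch
    last_id = batch.last[:id]
  end
end

sizes = []
each_batch(4) { |batch| sizes << batch.size }
p sizes # [4, 4, 2]
```

Walking the table by primary key like this is also why the approach needs a primary key to exist in the first place.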
Talk will be MySQL specific with a little Postgres thrown in.
Rules of Thumb: Always understand the queries your code is generating. EXPLAIN queries you don’t understand. Test with a heavily populated database to get an idea of speed issues. Pay close attention to your indexes.
github.com/riscfuture and @riscfuture on twitter.
Here we go, talking about problems of scale and basically how he kept breaking Scribd as he worked on it.
First talk! Tim Morgan ‘Oh Shit’ – Lessons in continually breaking a huge website. He works for Scribd.
I’m very late for the conference this morning, apologies for that. So I’ve missed Tim Bray’s Keynote – one I was particularly looking forward to. Still, there’s the rest of the day to write about.