The blocks used by iterators (such as loop and each) are a little different. Normally, the local variables created in these blocks are not accessible outside the block.
[ 1, 2, 3 ].each do |x|
y = x + 1
end
[ x, y ]
produces:
prog.rb:4: undefined local variable or method `x' for # (NameError)
However, if at the time the block executes a local variable already exists with the same name as that of a variable in the block, the existing local variable will be used in the block. Its value will therefore be available after the block finishes.
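A minimal sketch of that second case, wrapped in a hypothetical method (`successor_of_last` is an illustrative name, not something from the quoted text) so the scope is explicit. Because `y` is assigned before the block runs, the block reuses it and the value survives the block:

```ruby
# Hypothetical helper illustrating the rule quoted above.
def successor_of_last(values)
  y = nil                # y exists before the block executes
  values.each do |x|
    y = x + 1            # reuses the existing y rather than creating a block-local one
  end
  y                      # still visible here, holding the value from the last iteration
end

p successor_of_last([1, 2, 3])   # prints 4
```

The block parameter `x` is still not visible after the block; only the variable that existed beforehand is.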
My first impression was that it was weird for blocks to have access to variables outside the block. But after working with the language, I realized that this was perfectly natural and consistent with blocks being closures, and that the weird thing was that local variables created in the block were not accessible outside the block.
For example, it’s easy to find the sum or product of an array of numbers when the block has access to variables that have been previously defined:
irb(main):001:0> sum = 0
=> 0
irb(main):002:0> (1..10).each { |i| sum += i }
=> 1..10
irb(main):003:0> sum
=> 55
And it’s weird that this:
irb(main):001:0> if false then last = nil end
=> nil
irb(main):002:0> (1..10).each { |i| last = i }
=> 1..10
irb(main):003:0> last
=> 10
behaves differently than this:
irb(main):001:0> (1..10).each { |i| last = i }
=> 1..10
irb(main):002:0> last
NameError: undefined local variable or method `last' for main:Object
from (irb):2
The good thing is that it jumps up and bites you right away.
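As far as I can tell, the difference comes down to when Ruby considers a local variable to exist: the parser registers the variable as soon as it sees an assignment, even one that never executes. Here is a small sketch of the same two cases inside methods (both method names are hypothetical, chosen only for this illustration):

```ruby
def last_with_declaration(values)
  if false then last = nil end      # never executed, but `last` now exists in this scope
  values.each { |i| last = i }      # assigns to the method-level `last`
  last
end

def last_without_declaration(values)
  values.each { |i| last = i }      # here `last` is local to the block
  last                              # no such variable in this scope, so this raises
rescue NameError
  :undefined
end

p last_with_declaration(1..10)      # prints 10
p last_without_declaration(1..10)   # prints :undefined
```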
22 Mar: inject, inject, inject – OK, summing values was a bad example. I still maintain that having access to local variables outside the block is a better fit for the language.
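For the record, here is roughly what the `inject` version looks like. It is a sketch with hypothetical method names; the accumulator is passed through the block rather than kept in an outer local variable.

```ruby
def sum_of(numbers)
  # the block's return value becomes `total` on the next iteration
  numbers.inject(0) { |total, i| total + i }
end

def product_of(numbers)
  numbers.inject(1) { |total, i| total * i }
end

p sum_of(1..10)       # prints 55, the same total as the each-based example
p product_of(1..5)    # prints 120
```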
I have updated my comment and trackback package to allow comments and trackbacks to be downloaded as xml. The download link is available from the admin interface (presented at the bottom of the index page).
Tag Moderation for Trackback and Comments – trackback pings and comments are immediately available with most html tags removed (p allowed for readability). Sanitized tags are displayed upon approval.
Outbound Link Blacklist – trackback pings and comments that contain outbound links to banned sites are blocked. The user is responsible for providing a function identifying banned sites.
Rejection of trackback pings and comments that contain an excessive number of links.
Trackback and Comment deletion (unfortunately, this does not propagate to the RSS feed).
Comment Editing
Requirements
Web Server capable of running cgi scripts.
Perl with the following Perl Modules (I believe that these are core modules as of Perl 5.6.0):
For my first foray into Ruby, I’ve created an HTML sanitization method. It is partially based on Brad Choate’s perl sanitize_html (used in my standalone comments and trackback package). While this was not a good exercise in learning Ruby objects, it was a good exercise in Ruby regular expressions and String replacement.
With no further ado, here’s my annotated sanitize_html in Ruby:
A basic method declaration. The default set of allowed tags and attributes is provided as the default value for the okTags argument. The soloTags array contains tags that don’t require a closing tag.
def sanitize_html( html, okTags='a href, b, br, i, p' )
  # no closing tag necessary for these
  soloTags = ["br"]
We begin by building an allowed html tag hash. The hash keys are the allowed html tags and the hash values are arrays of allowed attributes for the respective tag. Here’s the blow by blow breakdown in irb:
irb(main):001:0> okTags = 'a href, b, br, i, p'
=> "a href, b, br, i, p"
irb(main):002:0> tags = okTags.downcase.split(',')
=> ["a href", " b", " br", " i", " p"]
irb(main):003:0> tags.collect!{ |s| s.split(' ') }
=> [["a", "href"], ["b"], ["br"], ["i"], ["p"]]
irb(main):004:0> allowed = Hash.new
=> {}
irb(main):005:0> tags.each do |s|
irb(main):006:1* key = s.shift
irb(main):007:1> allowed[key] = s
irb(main):008:1> end
=> [["href"], [], [], [], []]
irb(main):009:0> allowed
=> {"a"=>["href"], "b"=>[], "p"=>[], "br"=>[], "i"=>[]}
And here’s the corresponding code:
  # Build hash of allowed tags with allowed attributes
  tags = okTags.downcase().split(',').collect!{ |s| s.split(' ') }
  allowed = Hash.new
  tags.each do |s|
    key = s.shift
    allowed[key] = s
  end
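On a newer Ruby the same hash can be built in one pass. This is only an alternative sketch, not what the method uses: `build_allowed` is a hypothetical helper name, and it assumes `Enumerable#each_with_object` is available.

```ruby
# Hypothetical helper: turn 'a href, b, br' into {"a"=>["href"], "b"=>[], "br"=>[]}.
def build_allowed(ok_tags)
  ok_tags.downcase.split(',').each_with_object({}) do |spec, allowed|
    tag, *attrs = spec.split(' ')   # first word is the tag, the rest are its attributes
    allowed[tag] = attrs
  end
end

p build_allowed('a href, b, br, i, p')
```

The splat assignment does the work of `s.shift` in the original: it peels off the tag name and leaves the attribute list behind.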
Next, we perform a substitution on all <…> elements. We specify a non-greedy, multi-line regular expression (? and m respectively).
  # Analyze all <> elements
  stack = Array.new
  result = html.gsub( /(<.*?>)/m ) do | element |
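To see why the non-greedy `?` matters, here is a small sketch that lists what the pattern matches with and without it. The helper name `tag_like_elements` and its `greedy` flag are hypothetical, used only for this comparison.

```ruby
# Hypothetical helper: list every <...> run the pattern would hand to the block.
def tag_like_elements(html, greedy: false)
  pattern = greedy ? /(<.*>)/m : /(<.*?>)/m
  html.scan(pattern).flatten
end

sample = "<b>bold</b>\n<i>x</i>"
p tag_like_elements(sample)                 # prints ["<b>", "</b>", "<i>", "</i>"]
p tag_like_elements(sample, greedy: true)   # prints ["<b>bold</b>\n<i>x</i>"]
```

With the greedy form, the first `<` and the last `>` in the whole string become a single match, so the block would only ever see one giant "element".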
It’s a closing tag. After verifying that it’s allowed and that the opening tag has already been seen, use the stack to keep tags in matched pairs.
    if element =~ /\A<\/(\w+)/ then
      # </tag>
      tag = $1.downcase
      if allowed.include?(tag) && stack.include?(tag) then
        # If allowed and on the stack
        # Then pop down the stack
        top = stack.pop
        out = "</#{top}>"
        until top == tag do
          top = stack.pop
          out << "</#{top}>"
        end
        out
      end
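The stack handling is easier to follow in isolation. Here is a sketch of just that step, pulled out into a hypothetical helper (`close_through` is not part of the method):

```ruby
# Hypothetical helper: pop the stack down to and including `tag`,
# returning the closing tags emitted along the way.
def close_through(stack, tag)
  return "" unless stack.include?(tag)
  out = String.new
  loop do
    top = stack.pop
    out << "</#{top}>"
    break if top == tag
  end
  out
end

stack = %w[p b i]
p close_through(stack, "b")   # prints "</i></b>"
p stack                       # prints ["p"]
```

So a `</b>` that arrives while `<i>` is still open closes the `<i>` first, which keeps the output properly nested.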
It’s a solo tag. Pass through if allowed.
    elsif element =~ /\A<(\w+)\s*\/>/
      # <tag />
      tag = $1.downcase
      if allowed.include?(tag) then
        "<#{tag} />"
      end
It’s an opening tag. Push it onto the stack if it requires a closing tag. Replace with a simple opening tag if there are no allowed attributes. And sweep through the matched element testing for allowed attribute-value pairs if there are allowed attributes.
    elsif element =~ /\A<(\w+)/ then
      # <tag ...>
      tag = $1.downcase
      if allowed.include?(tag) then
        if ! soloTags.include?(tag) then
          stack.push(tag)
        end
        if allowed[tag].length == 0 then
          # no allowed attributes
          "<#{tag}>"
        else
          # allowed attributes?
          out = "<#{tag}"
          while ( $' =~ /(\w+)=("[^"]+")/ )
            attr = $1.downcase
            valu = $2
            if allowed[tag].include?(attr) then
              out << " #{attr}=#{valu}"
            end
          end
          out << ">"
        end
      end
    end
  end
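The `while ( $' =~ ... )` loop works because each successful match resets `$'` to whatever follows it, so the loop walks through the attributes one at a time. If the magic variable feels too clever, `String#scan` gives a similar result. This is a sketch of an alternative, with `filter_attributes` as a hypothetical helper name:

```ruby
# Hypothetical helper: keep only the name="value" pairs whose name is allowed.
def filter_attributes(element, allowed_attrs)
  element.scan(/(\w+)=("[^"]+")/).map do |name, value|
    attr = name.downcase
    allowed_attrs.include?(attr) ? " #{attr}=#{value}" : ""
  end.join
end

p filter_attributes('<a HREF="http://example.com" onclick="evil()">', ["href"])
```

Like the original loop, this only recognizes double-quoted, non-empty attribute values; anything else is silently dropped.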
Our previous substitution was on matched <…> elements. Now, clean up any >’s that are prior to the first <…> element and any <’s that follow the last <…> element.
  # eat up unmatched leading >
  while result.sub!(/\A([^<]*)>/m) { $1 } do
  end

  # eat up unmatched trailing <
  while result.sub!(/<([^>]*)\Z/m) { $1 } do
  end
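Here is the cleanup step on its own, as a sketch. `strip_unmatched_brackets` is a hypothetical name, and the input is copied first so the caller's string is left alone.

```ruby
# Hypothetical helper: remove stray > before the first < and stray < after the last >.
def strip_unmatched_brackets(text)
  result = text.dup
  # sub! returns nil once nothing matches, which ends each loop
  while result.sub!(/\A([^<]*)>/m) { $1 }
  end
  while result.sub!(/<([^>]*)\Z/m) { $1 }
  end
  result
end

p strip_unmatched_brackets("x>y <b>z</b> p<q")   # prints "xy <b>z</b> pq"
```

Each pass removes one stray bracket and keeps the text around it, so a string with several leading >’s simply takes several trips through the loop.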
If there are any tags left in the stack, then append the appropriate closing tags to the string.
  # clean up the stack
  if stack.length > 0 then
    result << "</#{stack.reverse.join('></')}>"
  end

  # return the sanitized string
  result
end
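The `join` trick is compact enough to deserve a quick check. A sketch, with `close_remaining` as a hypothetical helper name:

```ruby
# Hypothetical helper: build closing tags for everything still open, innermost first.
def close_remaining(stack)
  return "" if stack.empty?
  "</#{stack.reverse.join('></')}>"
end

p close_remaining(%w[p b i])   # prints "</i></b></p>"
p close_remaining([])          # prints ""
```

The separator `'></'` supplies the end of one closing tag and the start of the next, and the surrounding string literal supplies the first `</` and the last `>`.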
I thought that the ESPN Dream job was pretty cool. But that was nothing compared to this:
This year, the winner of a select fantasy league, in which people use the statistics of real-life players to simulate games, will get an actual front-office job with the San Francisco Giants.
I never played fantasy baseball. And I’m on the wrong coast. But a year with Brian Sabean, crunching numbers and evaluating players, that would be something.
PHP wasn’t different enough. I was prepared to convert to PHP. But upon further review, I think that learning PHP is just learning PHP. It’s great at what it does, but it’s not going to lead to a new way of thinking about programming.
I’m queasy about white space in Python. Anyone who has worked with my code knows that I’m a stickler about proper indentation. And once upon a time, before I converted to emacs, Python would have been attractive. But now that I rely upon emacs to manage my indentation for me, managing it myself seems backward.
I need to take a look at Ruby on Rails. I’m a sceptic, but the buzz about Ruby on Rails is hard to resist. And with first things coming first, it must be time to learn Ruby.
With the announcement of the tournament brackets, it’s time for the annual complaining about seeds. Personally, I think that it’s the matchups, not the seeding, that matter; the 1-seed doesn’t get any extra points. And for my money, if you can’t be a high seed (1-4), then 11 is the lucky number. Your road to the sweet sixteen goes through a 6-seed and a 3-seed. Good teams, but flawed; the kind of teams that get beat by Cinderella on the way to the ball.
Stanford men get an 8-seed. Just about the worst seed you can get. You’re guaranteed a tough game against a 9-seed and a 1-seed is penciled in for your second game. If Stanford gets past Mississippi State and Duke to the sweet sixteen, then I’ll be celebrating like it’s a national championship.
On the distaff side of things, the #1 ranked Stanford women get a 2-seed. And to make things worse, I don’t really disagree. The poll punishes late losses, rewards late wins, and never really corrects for strength of schedule. In a down year [decade?] for the west coast, Stanford didn’t earn a 1-seed.
That being said, what did Stanford do to deserve a projected matchup with 3-seed Connecticut? UConn started the year slow. But it looks like they’ve finally come to grips with basketball without Taurasi. Right now, I think that UConn is the strongest of the 3-seeds and I wouldn’t be surprised if they made the final four.
In previous years, Stanford has gone into the tournament as a top seed, with all the pressure and expectations that come with that seeding. This year, with a middling seed and a thin bench, every extra game will be gravy.
This proof of concept retrieves the manually created comments sub-table for this post from the Radio object database and creates the corresponding text for inclusion on the rendered page. A real implementation would retrieve comment data from the comment server and store that data in the appropriate comment sub-table.
I have some issues with the results. The comments are rendered on all pages (main index, monthly archive and daily). I would like the comments to be rendered only on the daily pages, and the comment links on the index and monthly archive pages to reference the rendered comments on the daily page. Unfortunately, I don’t have a good plan to achieve that.
PS: I know that the counter in the comments link doesn’t match the number of comments on the page. Just another detail to be worked out.