HTTP URL Validation Improved
I found the HTTP URL Validator for Rails very interesting, and well coded, yet it lacked some things such as URL format restrictions. I added some things, and I came up with a sweet solution.
It checks the format of the given URL, the content type, and whether it was permanently moved. I might be adding to this in the future.
#Check for content type:
validates_http_url :url, :content_type => "text/html"
#Do not check for content type, just make sure the site is accessible:
validates_http_url :website
#Make sure there is a DNS entry for a domain
validates_http_domain :domain
# Domain must be in 'www.site.com' or 'site.com' form.
# No http://, no path.
Update (6/26/06)
Added the validates_http_domain method:
def validates_http_domain(*attr_names)
  validates_each(attr_names) do |record, attr_name, value|
    # Clear the failed flag on the first successful lookup (all we need is one, one is all we need)
    failed = true
    possibilities = [value, "www."+value]
    possibilities.each do |url|
      begin
        Socket.gethostbyname(url)
      rescue SocketError
        # This name did not resolve; try the next possibility
        next
      end
      failed = false
      break
    end
    record.errors.add(attr_name, "cannot be resolved.") if failed
  end
end
Now I can just use
validates_http_domain :website
in my model, and everything comes up roses. ;)
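The lookup logic can also be tried outside Rails. Here is a minimal sketch, assuming the same "bare domain, then www. variant" order as the method above. The names domain_resolves? and DEFAULT_RESOLVER are hypothetical and not part of the plugin, and the resolver is passed in so the fallback can be exercised without touching the network.

```ruby
require 'socket'

# Hypothetical default resolver (not from the plugin): raises SocketError
# when the host name has no DNS entry.
DEFAULT_RESOLVER = ->(host) { Addrinfo.getaddrinfo(host, nil) }

# Hypothetical helper: true if the bare domain or its "www." variant resolves.
def domain_resolves?(domain, resolver = DEFAULT_RESOLVER)
  [domain, "www.#{domain}"].any? do |host|
    begin
      resolver.call(host)
      true
    rescue SocketError
      false
    end
  end
end
```

Calling domain_resolves?("site.com") with the default resolver performs a real DNS lookup; passing a lambda as the second argument swaps in a stub for testing.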
Update (9/30/06)
It was brought to my attention through a dialogue of emails and the comments that I needed a simple way for people to modify the plugin to accept different codes depending on their needs, or at least a simple way for me to modify the default accepted codes. Therefore, I made an array in the library called allowed_codes:
allowed_codes = [
Net::HTTPMovedPermanently,
Net::HTTPOK,
Net::HTTPCreated,
Net::HTTPAccepted,
Net::HTTPNonAuthoritativeInformation,
Net::HTTPPartialContent,
Net::HTTPFound,
Net::HTTPTemporaryRedirect,
Net::HTTPSeeOther
]
I’ll make it so you can push on your own custom codes from the model soon. This is what I envision:
validates_http_url :website, :extra_codes => [ Net::HTTPResetContent, Net::HTTPPartialContent ]
I’ll post here when this is reality, and probably make another blog post as well so the aggregators get it.
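A minimal sketch of how the envisioned :extra_codes option might be checked, assuming the defaults are the allowed_codes list above. The constant DEFAULT_ALLOWED_CODES and the helper acceptable_response? are hypothetical names for illustration, not something the plugin provides today.

```ruby
require 'net/http'

# Assumed to mirror the allowed_codes array from the library.
DEFAULT_ALLOWED_CODES = [
  Net::HTTPMovedPermanently,
  Net::HTTPOK,
  Net::HTTPCreated,
  Net::HTTPAccepted,
  Net::HTTPNonAuthoritativeInformation,
  Net::HTTPPartialContent,
  Net::HTTPFound,
  Net::HTTPTemporaryRedirect,
  Net::HTTPSeeOther
].freeze

# Hypothetical helper: true if the response class is one of the defaults
# or one of the caller-supplied extras.
def acceptable_response?(response, extra_codes = [])
  (DEFAULT_ALLOWED_CODES + extra_codes).any? { |klass| response.is_a?(klass) }
end
```

Merging the extras into a fresh array per call leaves the default list untouched, so one model's custom codes would not leak into another's.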
You can download it with svn:
svn co https://modzer0.cs.uaf.edu/repos/hank/code/http_url_validation_improved
Or use it as a plugin:
./script/plugin install -x https://modzer0.cs.uaf.edu/repos/hank/code/http_url_validation_improved
The above command only works if you have your entire rails project in subversion. If you do not, which I don’t recommend, you should either add it to a repository or alternatively remove the *-x* from the command. Of course, this will remove support for updating to the new code if I make a change.


Apie
June 16, 2006 at 4:40 PM
Is the [:content_uype] perhaps a typo?
unless configuration[:content_uype].nil?
  record.errors.add(attr_name, configuration[:message_wrong_content]) if response['content-type'].index(configuration[:content_type]).nil?
end
Hank
June 16, 2006 at 4:40 PM
Ahhh. Thank you! Fixing it now.
Apie
June 16, 2006 at 4:40 PM
Thank you for the nice code. I used it to validate a number of URLs. I just wanted to know whether the domain is real. I decided that response codes > 400 except
good_codes = [401,402,403,405,406,407,408] are okay. Any thoughts on this?
Hank
June 16, 2006 at 4:40 PM
Perhaps I should make a new method – validates_http_domain – that way you can just check the DNS entry for the domain. Not a bad idea. I’ll revise the code and add this method. Thanks for the input!
Apie
June 16, 2006 at 4:40 PM
That would be splendid! I’m going to add this post to my ebtags and will check in in a while. Looking forward to integrating the code. My code is a bit of a hack at this stage, and I would feel much better knowing it’s properly developed :)
Will this method return true for something like when yahoo.com redirects you to http://www.yahoo.com? This would be great, since from a human perspective yahoo.com is as valid as http://www.yahoo.com.
Many thanks!
Hank
June 16, 2006 at 4:40 PM
OK – it’s all set – test it out – I know I am. ;)
chemp
June 16, 2006 at 4:40 PM
Interesting point!
Caleb
June 16, 2006 at 4:40 PM
Hmm, this script seems to be having trouble with https://google.com.
Any Ideas?
Caleb
June 16, 2006 at 4:40 PM
Not able to find a way to contact you directly, I thought I’d just post a modified block of your code that allows the validator to validate both http AND https urls:
======================
url = URI.parse(value)
url.path = "/" if url.path.length < 1
http = Net::HTTP.new(url.host, (url.scheme == 'https') ? 443 : 80)
if url.scheme == 'https'
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
end
response, body = http.get(url.path)
===========================
Note that the constant is OpenSSL::SSL::VERIFY_NONE, with an underscore before NONE. There seem to be some weird formatting issues when posting to this page, and the underscore can get stripped.
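A variation on the block above, sketched under a few assumptions: the port comes from the parsed URL instead of being hardcoded, and the request path keeps the query string. The helper name http_client_and_path is hypothetical and not part of the plugin. The sketch only builds the client and opens no connection, and it leaves certificate verification at the library default rather than setting VERIFY_NONE.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: returns a configured (but unopened) Net::HTTP client
# plus the path to request for the given http or https URL.
def http_client_and_path(value)
  url = URI.parse(value)
  # url.port falls back to 80 for http and 443 for https when none is given
  http = Net::HTTP.new(url.host, url.port)
  http.use_ssl = (url.scheme == 'https')
  # request_uri is the path plus any query string, and "/" for an empty path
  [http, url.request_uri]
end
```

A caller would then run something like response = http.get(path) on the returned pair.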
Hank
June 16, 2006 at 4:40 PM
Thanks, Caleb. I’ll merge it.
sid137@gmail.com
June 16, 2006 at 4:40 PM
Hi,
When I try to use this to validate the following URL
http://www.youtube.com/watch?v=vFP-MktgOKU&eurl=
It tells me that the page is inaccessible.
I use a simple
validates_http_url :url, :on => :create
for my model, and am using svn version 136 I think..
Any ideas?
Thanks
Hank
June 16, 2006 at 4:40 PM
Here’s what I get (rev 138):
Website is not accessible Net::HTTPSeeOther
This is because a 303 response doesn’t count as valid in the current state of the plugin. I’ll make it valid since it obviously is. **svn up** at your leisure. In the future, you can just edit the **allowed_codes** array and submit a patch if you’d like:
# lib/http_url_validation_improved.rb
allowed_codes = [
  Net::HTTPMovedPermanently,
  Net::HTTPOK,
  Net::HTTPCreated,
  Net::HTTPAccepted,
  Net::HTTPNonAuthoritativeInformation,
  Net::HTTPPartialContent,
  Net::HTTPFound,
  Net::HTTPTemporaryRedirect,
  Net::HTTPSeeOther
]
Thanks for bringing this up. What a weird response.
Walter McGinnis
June 16, 2006 at 4:40 PM
Nice. You fixed the problems I was having with the validates_http_url plugin. The only thing I’m still wanting is better handling of the URL’s format. Your code just kicks it to the rescue, but doesn’t give the user back anything meaningful in a validation message.
For now I’m just going to do the validation of format separately with validates_format_of, but it would be a nice future enhancement.
Cheers,
Walter
Walter McGinnis
June 16, 2006 at 4:40 PM
Here’s a patch that finishes the work you started for formatting, starting at line 76:
# if response is nil, then it’s a format issue
if response.nil?
  record.errors.add(attr_name, configuration[:message_url_format])
else
  # Just plain non-accessible
  record.errors.add(attr_name, configuration[:message_not_accessible]+" "+response.class.to_s)
end
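The format check can also be done before any request is made, so a format error and an accessibility error never get mixed up. Here is a minimal sketch using only the standard uri library. The helper name http_url_format_ok? is hypothetical, and what counts as "well formed" here (an http or https scheme plus a host) is an assumption, not the plugin's rule.

```ruby
require 'uri'

# Hypothetical helper: true only for parseable http/https URLs with a host.
def http_url_format_ok?(value)
  url = URI.parse(value.to_s)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes
  url.is_a?(URI::HTTP) && !url.host.nil? && !url.host.empty?
rescue URI::InvalidURIError
  false
end
```

A validator could add the :message_url_format error when this returns false and only then fall through to the network check.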
riki
June 16, 2006 at 4:40 PM
Looks great but I’m getting a password dialog when trying to download the plugin.