I’ve been looking at methods of creating search-engine-friendly URLs in CakePHP. There are several articles about it, and the method I was most satisfied with was one described in an article on the Cake Bakery. One of the things I liked most about this approach was that it would automatically append an incremented number to the URL if an identical title was found. This is good, because I don’t want to worry about manually making sure the URL name is unique.
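The core idea can be sketched in a few lines. This is Python rather than CakePHP, and the names are mine, not the Bakery article’s; the `existing_slugs` set stands in for what would really be a database uniqueness check.

```python
import re

def make_unique_slug(title, existing_slugs):
    """Build a URL-friendly slug; append -2, -3, ... if it's already taken."""
    # Lowercase, collapse runs of non-alphanumerics into single hyphens
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    candidate = slug
    counter = 2
    while candidate in existing_slugs:  # stand-in for a DB lookup
        candidate = f"{slug}-{counter}"
        counter += 1
    return candidate

taken = {"my-first-post"}
print(make_unique_slug("My First Post!", taken))  # my-first-post-2
```

The nice part is that the counter only kicks in on a collision, so most URLs stay clean.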
Let’s talk about password hashing. Password hashing is the recommended way to store passwords in the database. However, there is a right way to do it and a wrong way to do it. Let’s cover some of these techniques.
Hashing algorithms like MD5 have been around for a while. These days MD5 has known weaknesses, and it isn’t recommended that you use MD5 alone. Rainbow tables are so established now that recovering a password from an unsalted MD5 hash takes no time at all. My understanding is that SHA1 isn’t far behind. The solution to beating rainbow tables is to salt your hash. Salting is the process of adding some extra data to your password before hashing it, to make it less predictable.
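Here’s a minimal Python sketch of the salting mechanics. It uses SHA-256 with a random per-password salt purely to illustrate the idea; in real systems a dedicated password-hashing function like bcrypt is the better choice.

```python
import hashlib
import os

def hash_password(password, salt=None):
    """Hash a password with a random salt so identical passwords
    produce different digests, defeating precomputed rainbow tables."""
    if salt is None:
        salt = os.urandom(16)  # 16 random bytes, unique per password
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt.hex(), digest  # store both alongside the user record

def verify_password(password, salt_hex, expected_digest):
    """Re-hash the attempt with the stored salt and compare."""
    _, digest = hash_password(password, bytes.fromhex(salt_hex))
    return digest == expected_digest

salt, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, digest))  # True
```

Because every user gets a fresh salt, two users with the same password end up with completely different digests, which is exactly what a rainbow table can’t precompute.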
Salsa is really easy to make. It’s probably also much cheaper to make than it is to buy a jar of it. Plus it tastes better. Salsa is also a healthy snack. It’s what you eat with it that may not be so healthy. So consider this a part of my coding and cooking series, even though I’m not going to talk much about coding in this article.
I don’t really have exact measurements for this recipe either, so depending on how much you make you may have to experiment with it. I use a Magic Bullet, but a blender or food processor will probably work too.
It’ll be a little warm from the sautéed tomatoes. I like my salsa chilled, so I usually throw mine in the fridge for an hour. Eat it up with tortilla chips or regular corn or flour tortillas. If you want to make it healthier, try putting it on top of fish or beans.
Don’t forget to continue coding!
Every now and then I come up with an idea. Most of the time it’s nothing big, but it usually has something to do with something I enjoy doing. Here is one of those ideas.
I’m sure you have experienced those days or nights where you’re coding away on something and after a while you realize that you have missed dinner or lunch. What about cooking something up? In my house, we have started making a lot of things from scratch. Bread is one of those things. Some think bread is hard to make. I don’t think it’s hard at all. Here is a great YouTube video on making bread. This guy also gives a great explanation of how yeast works.
Here are some updates on what’s going on. I’ve been working hard on many different projects, but without a lot of focus on a single one. I’ve been dabbling in encryption and hashing libraries for projects, a design update for stinkbug.net, small updates for HiddenWord.org, and bouncing around ideas for a budget application. Plus more.
One thing I’ve been researching a lot lately is simply marketing. Some social media marketing, but probably more than that, just plain old “what works best for this website” marketing, including things like A/B testing, etc. This type of marketing takes a lot of time, effort, and energy. Not to mention testing over and over again. I’m determined to get better in this area though. Finding out what works and what doesn’t work.
I’ve been bouncing around ideas for a stinkbug.net redesign for a while now. I’ll work on it for a little while and then go off to something else. I just can’t seem to focus on it for a very long period of time. My plan is to simplify the design significantly, with the hope that the simplicity of the design will last for years to come. Redesigning this site every year, or even making compatibility updates, is not something I want to be doing all the time. I think I may also start the process of cleaning up some of the older posts on this site.
I’ve created a sign-up form for anyone that may be interested in getting updates on the budget application that I’m thinking about developing. If you’re interested in checking it out, go sign up for updates. This helps give me an idea of how many people are interested.
Updated June 5, 2011: We will be doing some short tests, so you might see some hits from our crawler.
After doing some preliminary programming, I’ve done enough to understand the goals I want to set for building this web crawler. Now I’m ready to publish these goals.
The name of the crawler is StinkbugBot/0.1 – http://www.stinkbug.net. This will of course change as newer versions come out.
This crawler is not going to index pages that it can’t find on another page somewhere, unless it’s specifically told to. At some point there may be a way for people to do that. Right now I’m the only one that can do that. After pulling a page that has yet to be crawled, the crawler will extract all the URLs on the page and add them to a queue to be crawled at a later time.
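A rough Python sketch of that extract-and-enqueue step, using the standard library’s `HTMLParser` (the function and class names are just my illustration, not StinkbugBot’s actual code):

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect href values from anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def enqueue_links(page_url, html, queue, seen):
    """Add every not-yet-seen URL found on the page to the crawl queue."""
    parser = LinkExtractor()
    parser.feed(html)
    for href in parser.links:
        absolute = urljoin(page_url, href)  # resolve relative links
        if absolute not in seen:
            seen.add(absolute)
            queue.append(absolute)

queue, seen = deque(), set()
enqueue_links("http://www.stinkbug.net/",
              '<a href="/about">About</a> <a href="/about">Again</a>',
              queue, seen)
print(list(queue))  # ['http://www.stinkbug.net/about']
```

The `seen` set doubles as the crawl-once guard mentioned below: a URL only ever enters the queue the first time it’s discovered.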
To start with, the crawler will not crawl a page more than once. However, this functionality will change later to be a little more sophisticated. The idea here is that it can increment its crawl passes on pages as it runs out of first-time passes (if that actually ever happens), or is specifically told to do so in another instance.
The crawler will wait 10 seconds between requests to the same site. Hopefully this will prevent the crawler from overloading the site. It’s not like I have a ton of processing power to run this crawler anyway. I haven’t actually tested this part yet, so I’m not sure how I’m going to do it.
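One simple way to enforce that delay, sketched in Python, is to remember the timestamp of the last request per host and sleep for whatever is left of the 10 seconds. Again, this is just an illustration of the idea, not the crawler’s actual code:

```python
import time
from urllib.parse import urlparse

class PoliteThrottle:
    """Ensure at least `delay` seconds pass between requests to one host."""
    def __init__(self, delay=10.0):
        self.delay = delay
        self.last_request = {}  # host -> timestamp of the last fetch

    def wait(self, url):
        host = urlparse(url).netloc
        now = time.monotonic()
        if host in self.last_request:
            elapsed = now - self.last_request[host]
            if elapsed < self.delay:
                time.sleep(self.delay - elapsed)  # top up to the full delay
        self.last_request[host] = time.monotonic()

throttle = PoliteThrottle(delay=10.0)
throttle.wait("http://example.com/page1")   # first hit to the host: no wait
# throttle.wait("http://example.com/page2") # would sleep roughly 10 seconds
```

Because the timestamps are keyed by host, requests to different sites never block each other; only repeat visits to the same site get throttled.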
I would also like to recognize robots.txt files, but I’m not sure if that will be a part of the first version. That might be a later version.
There must also be a way for me to specify that a particular domain doesn’t get crawled. This would be in case someone complains that my crawler is bogging down their site.
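Both the blocklist and the robots.txt checks are straightforward with Python’s standard library. Here’s a sketch combining the two using the `urllib.robotparser` module; the blocklist contents and function name are placeholders of mine, and I’m assuming the bot identifies itself as “StinkbugBot”:

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Domains whose owners have asked not to be crawled (placeholder entry)
BLOCKED_DOMAINS = {"example-complainer.com"}

def allowed_to_crawl(url, robots_cache, agent="StinkbugBot"):
    """Return False if the domain is blocklisted or robots.txt disallows it."""
    host = urlparse(url).netloc
    if host in BLOCKED_DOMAINS:
        return False
    if host not in robots_cache:
        parser = RobotFileParser(f"http://{host}/robots.txt")
        try:
            parser.read()  # fetch and parse the site's robots.txt
        except OSError:
            pass  # robots.txt unreachable: fall through to the parser default
        robots_cache[host] = parser
    return robots_cache[host].can_fetch(agent, url)
```

Caching the parsed robots.txt per host matters here: without it, the crawler would re-fetch robots.txt before every single page on a site.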
As a web developer, some projects are more challenging than others. I love challenges. I believe that building a content management system is one of those bigger challenges to accomplish. I’ve spent the last several years with a main focus on doing just that. Of course it’s not perfect, but I feel that I have hurdled that challenge. Recently I’ve been thinking about new complicated challenges. Something I’ve always wanted to do, that I believe would be a huge challenge, would be to build a web crawler. I think that might be my next project.
I know some of you are thinking: why would I want to do that, especially since it’s already been done and there’s really no need for it? For me, that’s beside the point. I want to do it for the experience, for the challenge, for the pure complexity of it. To learn, and to be able to say I’ve done it, or even that I’ve failed to do it. If I decide to tackle this project, I’ll be posting some guidelines later that the bot will have to go by.
I’ve been reading about some basic information on web crawlers on Wikipedia.
Jeremy Keith recently tweeted:
“Inside the box is the new outside the box.”
Personally, I love this statement. Let me explain why. Over the years I have heard comments like “Think outside of the box,” practically meaning (at least in my situations) that we should think about how to do something like no one else has ever done it. Ten years ago, when the web was still fairly new and many of the technology standards hadn’t been completely adopted by designers and developers, thinking outside of the box was fine. We didn’t have to worry so much about breaking any rules, or people with disabilities, or mobile devices, or even different browsers, because Internet Explorer had 98% of the market.