Monday, 14 October 2013

Creating An OpenShift Web/Spring Application From The Command Line

OpenShift lets you create applications online, but this can also be achieved from the command line with the RHC client tool. On Windows, you will first need to install RubyGems and Git. The procedure is straightforward.

Git SSH Communication

OpenShift requires SSH communication between local Git repositories and the corresponding repositories on the server. Generating SSH keys for TortoiseGit on Windows can be tricky, but this post tells you how to do it.

Maintenance

From time to time, run the following commands for RubyGems and RHC updates:

> gem update --system
> gem update rhc

Creating the Spring application

Under Windows, open a cmd window and go to the directory where you want to create the application. Assuming you want to call it mySpringApp, run the following command:

> rhc app create mySpringApp jbosseap-6

The application will be automatically created and the corresponding Git repository will be cloned locally in your directory.

'Unable to clone your repository.'

If you encounter the above error, you will need to clone the Git repository manually. Assuming that TortoiseGit for Windows has been installed properly and that you have generated your SSH keys for Git, right-click on the directory where you want to clone the Git repository:

OpenShift git cloning manually

Enter the SSH URL in the first field (you can find it under 'My Applications' in OpenShift). Make sure you check 'Load Putty Key' and that the path points at your .ppk file.

This solution has been made available on StackOverflow too.

Making it a Spring Application

To transform the above application into a Spring application, follow the instructions available here.
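
As an illustration only, and not part of the linked instructions, here is a hedged sketch of what a minimal Spring MVC controller could look like once the Spring dependencies have been added to the project. The class name, URL mapping and message are made up for the example:

    import org.springframework.stereotype.Controller;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.ResponseBody;

    // Hypothetical controller, purely for illustration
    @Controller
    public class HelloController {

        // Handles requests to /hello and writes the returned string to the response body
        @RequestMapping("/hello")
        @ResponseBody
        public String hello() {
            return "Hello from OpenShift!";
        }
    }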

If you cannot execute Git from the command line, it is most probably not on your PATH. You will need to add it and open a new command-line window.

That's it, you are ready to go. Open the application in your favorite IDE. Don't forget to (Git) push the application to make it accessible from its OpenShift URL.

Tuesday, 1 October 2013

September 21-22-23, 2013 - Search Queries Not Updated in Google Webmaster Tools

This weekend, many people started reporting the same issue in Google's Webmaster Forum: no more daily search query data updates. For most, the data reporting stopped on September 23, 2013, but I have observed this since September 22, 2013.

Yesterday, a top contributor announced that this issue had been "escalated to the appropriate Google engineers". He mentions that the issue started on September 21st. Therefore, it took 9 days before someone could confirm that Google was aware of it. Google Webmaster Tools (GWT) is known to lag 2 or 3 days behind when it comes to search query data, which explains why most webmasters only started to ask questions at the end of last week. This issue made the headlines of Search Engine Roundtable too.

The confirmation post links to a 2010 video in which Matt Cutts discusses which types of Webmaster Tools errors should be reported to Google. He mentions that Google engineers are a bit touchy when asked whether they monitor their systems. So did Google know about this issue since September 21st and deliberately decide not to answer posts in the Webmaster Tools forum for 9 days, or did they just miss it because it was not monitored?

Many people have been hit by the recent Panda updates. The August 21st and September 4th updates, as well as more recent dates, have triggered a lot of comments in forums. Many websites lost all their traffic without any explanation. No message in GWT, no manual penalty, nothing. Some of these sites were using plain white hat SEO. Webmasters working hard to produce quality content need GWT search query data feedback, especially when they believe some of their sites have been hit by recent updates. It helps them find out whether they have implemented the proper corrections or not.

On September 11th, a new Matt Cutts video was posted about finding out whether one has been hit by Panda, and whether one has recovered from it. Unfortunately, it does not contain clear-cut information answering the question. This video only confirms that Panda is now integrated into indexing and that one should focus on creating quality content. Google's interpretation of quality content is still vague, yet they have implemented algorithms to sort web pages.

If there is a bug impacting customers using their service, why isn't Google officially open and communicative about it? This has been an ongoing complaint from webmasters. I can understand that Google does not want to give too much information about their systems. They don't want hackers to exploit it against them. However, it clearly seems that the focus is more on not communicating with hackers than on communicating openly with regular webmasters. Is Google in defensive mode?

Google is capable of algorithmically detecting when a website (or some part of a website) has quality issues. It does not hesitate to penalize such websites. Then why doesn't Google communicate automatically about these issues to regular webmasters in GWT? It is algorithmically possible, and scalable too. Google is not the only party interested in creating quality websites; it is in the interest of regular webmasters too. Of course, hackers would try to exploit this information, but if regular webmasters had it too, they would overall create better content than hackers. Users would still sort the good websites from the bad ones, not only Panda.

Sometimes, it really seems like Google does not truly want to collaborate with regular webmasters. I notice selective listening followed by monologues. Ask me questions and I'll answer them. I won't acknowledge any flaws, but I'll secretly work on these so you can't poke me again. This is not a collaborative dialogue, it is a defensive attitude. I believe that acting with excessive caution directly hampers the achievement of one's own objectives.

My strong opinion is that if Google solved this communication issue, it would bring much more return than any further stream of tweaks to their Panda algorithm. Give people the information they need to do a good job, empower them, trust them. Right now, the level of frustration is pretty high in the webmaster community. Frustration leads to lack of motivation. Lack of motivation decreases productivity. No productivity means no chance of seeing new quality content or improvements.

There is a needless vicious circle and Google can do something about it, for its own good too.

Monday, 9 September 2013

Best Responsive Design Breakpoints

While trying to answer my own question, "What are the best responsive design breakpoints?", I performed a small statistical study of smartphone screen widths (portrait and landscape) using information provided by i-skool.

SmartPhone Screen Width Study (Portrait & Landscape)

The above table shows how often each specific smartphone width is reported in the source data. There are five peaks:
  • 320 pixels
  • 480 pixels
  • 768-800 pixels
  • 1024 pixels
  • 1280 pixels
These look like good responsive design breakpoint candidates.

Best Google Ad Formats

Google offers several ad formats. Assuming the breakpoints above, here are examples of adequate ad formats with regard to width:
  • 320 to 479 pixels - Mobile Leaderboard (320x50), Half Banner (234x60), Medium Rectangle (300x250)
  • 480 to 767 pixels - Banner (468x60), Link Unit (468x15, displays 4 links)
  • 768 pixels and above - Leaderboard (728x90), Link Unit (728x15, displays 4 links)

Saturday, 7 September 2013

Sep 4th, 2013 - Sudden Drop In Traffic - A Thin Or Lack Of Original Content Ratio Issue?

Many people have reported a sudden drop in traffic to their websites since September 4th, 2013.

Google Webmaster forum is full of related posts. A Webmaster World thread has been started. Search Engine Roundtable mentions 'early signs of a possible major Google update'. A group spreadsheet has been created. No one seems to make sense of what is happening. There is a lot of confusion and speculation, without any definitive conclusion.

Since then, I have seen a major drop in traffic on a new website I am working on. However, the traffic of this blog has remained the same. No impact.

I am going to post relevant information and facts here as I find them. If you have any relevant or conclusive information to contribute, please do so in the comments and I will include them here. Let's try to understand what has happened.

Symptoms

Facts & Observations

  • Many owners claim no black hat techniques, no keyword stuffing, only original content, legitimate incoming links.
  • Many owners say they have not performed any (or significant) modifications to their website.
  • All keywords and niches are impacted.
  • Both old and new websites are impacted.
  • Both high and low traffic websites are impacted.
  • Some blogs are also impacted.
  • It is an international issue, not specific to a country, region or language.
  • Sites with few backlinks are also penalized, not only those with many backlinks.
  • Nothing changed from a Yahoo or Bing ranking perspective.
  • One person mentions a site with thin content still ranking well.
  • At least one website with valuable content has been penalized.
  • Several sites acting as download repositories have been impacted.
  • Some brands have been impacted too.
  • So far, Google has nothing to announce.
Also:
  • In May 2013, Matt Cutts announced that Panda 2.0 aims to get better at fighting black hat techniques. He also announced that upcoming changes would include better identification of websites with higher quality content, higher authority and higher trust. Google wants to know whether you are 'an authority in a specific space'. Link analysis will become more sophisticated too.

Example Of Impacted Websites

  1. http://www.keyguru1.com/
  2. http://allmyapps.com/
  3. http://www.techerhut.com
  4. http://domainsigma.com/
  5. http://www.knowledgebase-script.com/
  6. http://alternativeto.net/
  7. http://www.pornwebgames.com/ (adult content)
  8. http://www.fresherventure.net/
  9. http://www.casaplantelor.ro/
  10. http://www.dominicancooking.com/
  11. http://www.weedsthatplease.com/
  12. http://www.medicaltourinfo.com/
  13. http://www.safetricks.net/
  14. http://www.qosmodro.me/
  15. http://botsforgames.com/
  16. http://blog.about-esl.com/
  17. http://last-minute.bg/
  18. http://www.shine.com/
  19. http://www.rcmodelscout.com/
  20. http://itech-hubz.blogspot.nl/
  21. http://gpuboss.com/
  22. http://www.taxandlawdirectory.com/
  23. http://www.newkannada.com/
  24. http://places.udanax.org/review.php?id=33
  25. http://www.seniorlivingguide.com/
  26. http://listdose.com/
  27. http://indianquizleague.blogspot.nl/
  28. http://beginalucrativeonlinebusiness.com/
  29. http://www.teenschoolgirlsfucking.com/ (adult content)
  30. http://quickmoney365.com/
  31. http://www.orthodoxmonasteryicons.com/
  32. http://codecpack.co/
  33. http://www.filecluster.com/
  34. http://pk.ilookstyle.com/
  35. http://www.ryojeo.com/2013/08/ciptojunaedy.html

Hypotheses

  • Websites with content duplicated on other pirate websites are penalized.
  • Websites with little or no original or badly written content are penalized (thin content vs plain content ratio).
  • Websites with aggregated content have been penalized.
  • Sites having a bad backlink profile have been penalized.
  • Sites having outbound links to murky sites or link farms have been penalized.
  • Ad density is an issue.
  • Google has decided to promote brand websites.
  • This is a follow-up update to the August 21st/22nd update, at a broader scale or deeper level.
  • An update has been posted and contains a bug (or a complex, unanticipated and undesirable side effect).

Analysis

Using collected information and data gathered in the group spreadsheet:
  • Average drop in traffic is around 72%
  • No one reports use of black hat techniques
  • 12.8% report use of grey hat techniques
  • 23.1% report impact before 3rd/4th September
  • 7.7% have an EMD (exact match domain)
  • 17.9% had a couple of 404 or server errors
  • 17.9% are not using AdSense
  • 30.8% admit thin content
  • 38.5% admit duplicate content
  • 25.6% admit aggregated content
  • 15.4% admit automatically generated content
  • 64.1% admit thin, duplicate, aggregated or automatically generated content
  • The number of backlinks ranges from 10 to 5.9 million
  • The number of indexed pages ranges from 45 to 12 million
The spreadsheet sample contains only 39 entries, which is small.
  1. The broad range for the number of backlinks seems to rule out a pure backlink (quality or amount) issue.
  2. The broad range of indexed pages points at a quality issue, rather than a quantity issue.
  3. More than 92% do not have an EMD, so this rules out a pure exact match domain issue.
  4. More than 82% did not have server or 404 issues, so this rules them out as the main cause of a quality issue.
  5. 17.9% are not using AdSense, meaning this cannot be only a 'thin content above the fold' or 'too many ads above the fold' issue.
  6. Some brand websites have been impacted. Therefore, it does not seem like Google tries to promote them over non-brand websites.
  7. Domain age, country or language are not discriminating factors.

    Best Guess

    By taking a look at the list of impacted websites and the information gathered so far, it seems like we are dealing with a Panda update where sites are delisted or very severely penalized in search rankings because of quality issues.

    These are likely due to thin content, lack of original content, duplicate content, aggregate content or automatically generated content, or a combination of these. It seems like a threshold may have been reached for these sites, triggering the penalty or demotion.

    Regarding duplicate content, there is no evidence confirming for sure that penalties have been triggered because a 3rd party website stole one's content. More than 60% do not report duplicate content issues.

    To summarize, the September 4th culprit seems to be a high ratio of thin or non-original content, leading to an overall lack of high quality content, which in turn leads to a lack of trust and authority in one's specific space.

    Unfortunately, Google has a long history of applying harsh mechanical decisions to websites without providing any specific explanation. This leaves people guessing what is wrong with their websites. Obviously, many of the impacted websites are not the products of hackers or ill-willed people looking for an 'I win - Google loses' relationship.

    Some notifications could be sent in advance to webmasters who have registered with Google Webmaster Tools. If webmasters register, it can only mean they are interested in being informed (and not after the fact). This would also give them an opportunity to solve their website issues and work hand-in-hand with Google. So far, there is no opportunity or reward system to do so.

    Possible Solutions

    Someone from the Network Empire claims that Panda is purely algorithmic and that it is run from time to time. If this is true, then this might explain why no one received any notifications or manual penalty in Google Webmaster Tools, and why no one will.

    Google might just be waiting for people to correct issues on their websites and will 'restore' these sites when they pass the Panda filter again. The upside is that this update may not be as fatal as it seems.

    Assuming the best guess is correct, the following would help solve or mitigate the impact of this September 4th update:
    • Re-read Dr. Meyers' post about Fat Panda & Thin content.
    • Thin content pages should be marked as noindex (or removed from one's website) or merged into plain/useful/high quality content pages for users.
    • Low quality content (lots of useless text) pages should preferably be removed from the website, or at least be marked as noindex.
    • Internal duplicate content should be eliminated by removing duplicate pages or by using rel="canonical" (canonical pages).
    • Content aggregated from other websites is not original content. Hence, removing these pages can only help (or at least, these pages should be marked as noindex).
    • A lack of valuable content above the fold should be addressed by removing excessive ads, if any.
    • Old pages not generating traffic should be marked as noindex (or removed).
    • Outbound links to bad pages should be removed (or at least marked as nofollow), especially if they do not contribute to good user experience. This helps restore credibility and authority.
    • Disavow incoming links from dodgy or bad quality websites (if any). One will lose all PageRank benefit from those links, but it will improve one's reputation.
    • Regarding Panda, it is known (and I'll post the link when I find it again) that one bad quality page can impact a whole website. So being diligent is a requirement.
    • Lorel says that he has seen improvement on his client websites after de-optimizing and removing excessive $$$ keywords.
    Something to remember:
    • Matt Cutts has confirmed that noindex pages can accumulate and pass PageRank. Therefore, using noindex may be more interesting than removing a page, especially if it has accumulated PageRank and if it has links to other internal pages.

    Friday, 30 August 2013

    Encode & Decode URL Parameters In Java

    A small code example describing how to encode and decode URL query strings properly from Java. The code example is available from GitHub in the Java-Core-Examples directory.
    // Requires java.net.URLEncoder and java.net.URLDecoder
    String paramValue = "with spaces and special &!=; characters";

    // Encode the raw value, then decode the encoded form back
    String encoded = URLEncoder.encode(paramValue, "UTF-8");
    String decoded = URLDecoder.decode(encoded, "UTF-8");

    System.out.println(paramValue);
    System.out.println(encoded);
    System.out.println(decoded);
    
    The output is:
    with spaces and special &!=; characters
    with+spaces+and+special+%26%21%3D%3B+characters
    with spaces and special &!=; characters
    
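    As a small follow-up, and not part of the GitHub example, here is a sketch of how one might build a complete query string from several parameters with the same classes. The parameter names and values below are hypothetical:

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class QueryStringBuilder {

        public static void main(String[] args) throws UnsupportedEncodingException {

            // Hypothetical parameters, purely for illustration
            Map<String, String> params = new LinkedHashMap<String, String>();
            params.put("q", "encode & decode");
            params.put("lang", "en");

            StringBuilder query = new StringBuilder();
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (query.length() > 0) {
                    query.append('&');
                }
                // Encode both the parameter name and its value
                query.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                     .append('=')
                     .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }

            // Prints: q=encode+%26+decode&lang=en
            System.out.println(query);
        }
    }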

    Thursday, 29 August 2013

    FreeMarker Reminders

    A small post to aggregate notes related to FreeMarker.

    Accessing A Parameter Inside A Directive

    The following piece of code:
        <#assign myVar = ${myValue}-1>
    
    will trigger the following exception:
        Exception in thread "main" freemarker.core.ParseException:
          Encountered "{" ...
    
    One should not use ${...} inside a FreeMarker directive, but rather:
        <#assign myVar = myValue - 1 >
    
    See StackOverflow question.

    How To Test A Boolean Value

    This is how to test the enableAds boolean parameter value:
        <#if enableAds>
            ${AdsTopLeft}
        <#else>
            <img src="./incl/AdsSubstitute160-600.jpg">
        </#if>
    

    Miscellaneous

    • Testing a value (if)
    • Does a parameter/variable exist?
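
    For completeness, here is a minimal sketch of how such parameters might be supplied from the Java side. The template directory, template name and data model entries are assumptions made for this illustration:

    import java.io.File;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.util.HashMap;
    import java.util.Map;

    import freemarker.template.Configuration;
    import freemarker.template.Template;

    public class FreeMarkerBooleanExample {

        public static void main(String[] args) throws Exception {

            // FreeMarker 2.3.x style configuration, loading templates
            // from a hypothetical 'templates' directory
            Configuration cfg = new Configuration();
            cfg.setDirectoryForTemplateLoading(new File("templates"));

            // Data model containing the boolean tested in the template,
            // plus a placeholder for the ad markup it may print
            Map<String, Object> model = new HashMap<String, Object>();
            model.put("enableAds", Boolean.TRUE);
            model.put("AdsTopLeft", "<!-- ad markup placeholder -->");

            // Hypothetical template name
            Template template = cfg.getTemplate("page.ftl");

            // Process the template against the data model and write to the console
            Writer out = new OutputStreamWriter(System.out);
            template.process(model, out);
            out.flush();
        }
    }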

    Thursday, 18 April 2013

    Calling Javascript From Java (Example)

    This post explains how to call Javascript from Java. The code example is available at GitHub in the Java-Javascript-Calling directory.

    Code Example

    We create some small Javascript code in a string. This code prints "Hello world!" and defines a function called myFunction() that adds 3 to the provided value:
    import javax.script.Compilable;
    import javax.script.CompiledScript;
    import javax.script.Invocable;
    import javax.script.ScriptEngine;
    import javax.script.ScriptEngineManager;
    import javax.script.ScriptException;

    public class JavascriptCalling {
    
        public static final String MY_JS
            = "print(\"Hello world!\\n\");"
            + "var myFunction = function(x) { return x+3; };";
    
        public static void main(String[] args)
                throws ScriptException, NoSuchMethodException {
    
            // Retrieving the Javascript engine
            ScriptEngine se = new ScriptEngineManager()
                .getEngineByName("javascript");
    
            if ( Compilable.class.isAssignableFrom(se.getClass()) ) {
    
                // We can compile our JS code
                Compilable c = (Compilable) se;
                CompiledScript cs = c.compile(MY_JS);
    
                System.out.println("From compiled JS:");
                cs.eval();
    
            } else {
    
                // We can't compile our JS code
                System.out.println("From interpreted JS:");
                se.eval(MY_JS);
    
            }
    
            // Can we invoke myFunction()?
            if ( Invocable.class.isAssignableFrom(se.getClass()) ) {
    
                Invocable i = (Invocable) se;
                System.out.println("myFunction(2) returns: "
                    + i.invokeFunction("myFunction", 2));
    
            } else {
    
                System.out.println(
                    "Method invocation not supported!");
    
            }
    
        }
    
    }
    The above code starts by retrieving the Javascript engine. Next, if the engine can compile the Javascript, our code is compiled and the script is executed, which prints "Hello world!". Otherwise, the script is simply interpreted. Finally, if the engine allows invocation, we call the function defined in our Javascript code.

    The result is:
    From compiled JS:
    Hello world!
    myFunction(2) returns: 5