Saturday, 30 October 2010

Login to a website using Apache HttpClient and Jericho HTML Parser

The fastest way to create a program is to use code that's already written: a library. Recently I needed to build an HTTP-aware client application in Java. The application had to log in to a website, download files automatically and log out. It was essential that the program could parse the HTML content to find the login form, perform the login correctly and then find the links to the files to download. I decided to use Apache HttpClient to access resources via HTTP and Jericho HTML Parser to parse the HTML content. In this article I will share the most important functions I'm using. They show how HttpClient and Jericho can work together to achieve that goal.

First of all, we need to import the libraries:
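// Java standard library
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Apache HttpClient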
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.ResponseHandler;
import org.apache.http.cookie.Cookie;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.protocol.HTTP;

// Jericho HTML Parser
import net.htmlparser.jericho.*;
Now we can write a first useful function to retrieve the login data. First, a short preamble:

The traditional HTTP login form is something like this:

<form method="POST" action="action">
<input type="text" name="username" value="Username" /> 
<input type="password" name="password" value="Password" />
...
</form>

Note that the form is usually submitted as a set of name/value pairs, and those pairs are what we have to deal with.
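
To make the name/value idea concrete, here is a minimal sketch of how such pairs end up in the body of a POST request. It uses only the JDK rather than HttpClient, and the class name FormEncodingSketch and the sample credentials are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormEncodingSketch {

    // Encode name/value pairs as an
    // application/x-www-form-urlencoded request body.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical credentials, named after the fields
        // of the example form.
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("username", "alice");
        fields.put("password", "p&ss word");
        // prints username=alice&password=p%26ss+word
        System.out.println(encode(fields));
    }
}
```

Characters such as "&" and spaces have to be escaped, otherwise the server cannot tell where one pair ends and the next begins.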

The buildNameValuePairs function parses the HTML document and returns a List<NameValuePair> holding the name/value pairs needed to log in to a website. The first step in parsing an HTML document is to construct a Source object from the source data, which can be a String, Reader, InputStream, URLConnection or URL. In this case, the function doesn't construct the Source by loading the content directly from the URL, because the parsing must stay inside the connection created by HttpClient. There is no need to open another connection and waste resources.
public static List<NameValuePair> buildNameValuePairs(
                                       InputStream is,
                                       String strUsername,
                                       String strPassword)
                                       throws Exception  {
        // Initialize the list
        List<NameValuePair> nvps = 
                      new ArrayList<NameValuePair>();

        // String arrays containing the column labels
        // and values. Both start as null so that they
        // are definitely assigned after the loop.
        String[] columnLabels = null, columnValues = null;

        // Register the tag types related to the Mason
        // server platform.
        MasonTagTypes.register();

        // Construct a new Source object by loading the
        // content from the specified InputStream.
        Source source = new Source(is);

        // Set the Logger that handles log messages to null.
        // We are a windowed application and we don't
        // need to handle verbose outputs.
        source.setLogger(null);

        // Parses all of the tags in this source document
        // sequentially from beginning to end.
        source.fullSequentialParse();

        // Return a list of all forms in the 
        // source document.
        List<Element> formElements = 
                                     source.getAllElements(
                                      HTMLElementName.FORM);

        // Start the loop
        for (Element formElement : formElements) {
            String loginTag = 
                       formElement.getStartTag().toString();
            // stop the execution of the current iteration 
            // and go back to the beginning of the loop to 
            // begin a new iteration
            if (loginTag == null) continue;
            // Let's find the login form.
            else if (
              loginTag.toLowerCase().indexOf("login")>-1) {
                // Create a segment of the Source document 
                // containing the login form as CharSequence.
                CharSequence cs = formElement.getContent();
                // Constructs a new Source object from 
                // the specified segment.
                source = new Source(cs);

                // Return a collection of FormField objects.
                // Each form field consists of a group of 
                // form controls having the same name.
                FormFields formFields = 
                                     source.getFormFields();

                // Return a string array containing 
                // the column labels corresponding to the 
                // values from the getColumnValues(Map)
                // method.
                columnLabels = formFields.getColumnLabels();
                // Convert all the form submission values 
                // of the constituent form fields into a 
                // simple string array.
                columnValues = formFields.getColumnValues();

                break;
            }
        }

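        /**
         * Now we can construct the List of name/value pairs
         * to login to a website. If no login form was found
         * the arrays are still null and the list stays empty.
         */
        if (columnLabels != null && columnValues != null) {
            for (int i = 0; i < columnValues.length; i++) {
                String value = columnValues[i];
                // Replace the form defaults with the
                // user-supplied credentials. The field names
                // "username" and "password" are assumed from
                // the example form above.
                if (columnLabels[i].equalsIgnoreCase("username"))
                    value = strUsername;
                else if (columnLabels[i].equalsIgnoreCase("password"))
                    value = strPassword;
                nvps.add(new BasicNameValuePair(columnLabels[i],
                                                value));
            }
        }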

        // return statement
        return nvps;    
}

Now we can build the main function, which establishes a connection and logs in using the buildNameValuePairs function. It takes three arguments: the URL used to perform the login (as a String), the username and the password.

public static boolean loginExecuted(String strDomainUrl, 
                                    // user-supplied
                                    String strUsername, 
                                    // user-supplied
                                    String strPassword) 
                                    // user-supplied
                                    throws Exception {
        boolean bSuccess = false;

        // Create a new HTTP client and a
        // connection manager
        DefaultHttpClient httpclient = 
                                new DefaultHttpClient();

        // The GET method means retrieve whatever
        // information (in the form of an entity) is
        // identified by the Request-URI. If the Request-URI
        // refers to a data-producing process, it is the
        // produced data which shall be returned as the 
        // entity in the response and not the source text 
        // of the process, unless that text happens to be 
        // the output of the process.
        HttpGet httpget = new HttpGet(strDomainUrl);

        HttpResponse response = httpclient.execute(httpget);

        // An entity that can be sent or received with an
        // HTTP message. It can be null when the response
        // has no body, so check it before reading.
        HttpEntity entity = response.getEntity();

        List<NameValuePair> loginNvps =
                              new ArrayList<NameValuePair>();

        if (entity != null) {
            // The InputStream from the entity
            InputStream instream = entity.getContent();

            // Now we can call the function we built before
            loginNvps = buildNameValuePairs(instream,
                                            strUsername,
                                            strPassword);

            instream.close();
            entity.consumeContent();
        }

        if (loginNvps.size() > 0) {
            // The post method is used to request that 
            // the origin server accept the entity enclosed  
            // in the request as a new subordinate of the 
            // resource identified by the Request-URI in 
            // the Request-Line. Essentially this means 
            // that the POST data will be stored by the
            // server and usually will be processed 
            // by a server side application.
            HttpPost httpost = new HttpPost(strDomainUrl);

            httpost.setHeader("User-Agent",
                       "Mozilla/5.0 (compatible; MSIE 7.0; "
                       + "Windows 2000)");

            httpost.setEntity(
                     new UrlEncodedFormEntity(loginNvps, 
                                              HTTP.UTF_8));

            response = httpclient.execute(httpost);
            entity = response.getEntity();

            if (entity != null)
                entity.consumeContent();
            
            
            // At this point we can handle the connection
            // as we like. For example we can check whether
            // the procedure was successful (and set
            // bSuccess = true), read the content, download
            // files...
        }

        // return statement
        return bSuccess;
    }
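
The function above leaves the success check to the reader. One possible heuristic, sketched here with the JDK only, is to treat the login as successful when the server answers with a 2xx or 3xx status and the returned page no longer contains a password field. Both the class name LoginCheckSketch and the heuristic itself are assumptions, not something every website honours. With HttpClient the status code is available from response.getStatusLine().getStatusCode(), and the body can be read with EntityUtils.toString(entity) before the entity is consumed.

```java
import java.util.Locale;

class LoginCheckSketch {

    // Heuristic only: a failed login is assumed to serve
    // the login form (with its password input) again.
    static boolean looksLoggedIn(int statusCode, String body) {
        boolean statusOk = statusCode >= 200 && statusCode < 400;
        boolean loginFormStillThere = body != null
            && body.toLowerCase(Locale.ROOT)
                   .contains("type=\"password\"");
        return statusOk && !loginFormStillThere;
    }

    public static void main(String[] args) {
        // Hypothetical response bodies.
        System.out.println(looksLoggedIn(200, "<p>Welcome back</p>"));
        System.out.println(looksLoggedIn(200,
            "<input type=\"password\" name=\"password\" />"));
    }
}
```

The result of such a check is what would be assigned to bSuccess before returning.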
That's all for now (I will revisit this post as soon as possible to fix any errors).
Best regards.

Friday, 29 October 2010

Netbeans - Prevent FrameView from closing

Preventing a FrameView from closing seems to be surprisingly difficult. After browsing the internet and testing all of the suggestions, I found a way to achieve what I wanted. Let's imagine that we want to build an application called MyProggy. NetBeans will create two files, called MyProggyApp.java and MyProggyView.java. MyProggyApp is the main class, while MyProggyView takes care of creating the frame.
In MyProggyApp.java we can modify the startup method in this way (registering an ExitListener that implements the canExit() and willExit() methods):
    /**
     * At startup create and show the main frame of the 
     * application.
     */
    @Override protected void startup() {
        show(new MyProggyView(this));

        // Create the ExitListener
        ExitListener exitListener = new ExitListener() {    
            public boolean canExit(EventObject arg0) { 
                // return statement
                return false; 
            }

            public void willExit(EventObject arg0) {}
        };
        // Add the Listener
        addExitListener(exitListener);
    }
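
To see why returning false from canExit keeps the application alive, here is a simplified model of the veto mechanism written with the JDK only. It is not the framework's code: ExitVetoSketch, its nested ExitListener interface and the requestExit method are stand-ins invented for this sketch, on the assumption that exit proceeds only when every registered listener agrees.

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

class ExitVetoSketch {

    // Hypothetical stand-in for the framework's listener.
    interface ExitListener {
        boolean canExit(EventObject event);
        void willExit(EventObject event);
    }

    // A listener that always vetoes, like the one above.
    static class NeverExit implements ExitListener {
        public boolean canExit(EventObject event) { return false; }
        public void willExit(EventObject event) {}
    }

    private final List<ExitListener> listeners =
        new ArrayList<ExitListener>();

    void addExitListener(ExitListener listener) {
        listeners.add(listener);
    }

    // Returns true if the exit would proceed,
    // false if any listener vetoed it.
    boolean requestExit(EventObject event) {
        for (ExitListener listener : listeners) {
            if (!listener.canExit(event)) {
                return false;
            }
        }
        for (ExitListener listener : listeners) {
            listener.willExit(event);
        }
        return true;
    }

    public static void main(String[] args) {
        ExitVetoSketch app = new ExitVetoSketch();
        EventObject closeRequest = new EventObject(app);
        System.out.println(app.requestExit(closeRequest)); // true
        app.addExitListener(new NeverExit());
        System.out.println(app.requestExit(closeRequest)); // false
    }
}
```

Since willExit is only reached when nobody vetoes, leaving it empty in the listener above is harmless.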
After that, in MyProggyView.java we can override the windowClosing method (and windowIconified, windowDeiconified...):
    final JFrame frame = this.getFrame();

    frame.addWindowListener(new WindowAdapter() {
        @Override
        public void windowIconified(WindowEvent evt) {
            // Do something
        }

        @Override        
        public void windowClosing(WindowEvent e) {
            // Do something
        }
     });

That's all, we're done.

Best regards.

Sunday, 17 October 2010

Google Toolbar API - Custom search buttons

We all know the power of Google Toolbar. What we should also know is that this extension is API-based: "The Google Toolbar API lets webmasters create custom buttons for the Google Toolbar (version 4 and above) using XML. Buttons can navigate to and search a site, display an RSS feed in a menu, and change their icon dynamically. Users can add your custom buttons to their Toolbar by clicking on a link on your website or Google's Button Gallery."


Adding a custom search button is very easy. Unfortunately, at the moment the API isn't smart enough to always work out the right way to query a search engine. Moreover, some websites hide the correct way to interact with their search engine. This means that we'll have to edit the button using the Advanced Editor. More precisely, we have to edit the <search></search> tag.


Let's take an example. Let's assume that we want to create a custom button to search this blog or, in general, a blog hosted by Blogspot:
  1. Right click on the search box on the right of the page and select "Generate Custom Search..." from the menu that appears
  2. Click on "Add" and then "OK" in the Custom Button installation dialogs
The button will not work, because the API has built the wrong URL to query the search engine.
Let's open the Advanced Editor, where we'll see a tag like this:

<search charset="utf-8">http://stefanobolli.blogspot.com/?search={query}</search>

The URL isn't correct: it should be http://stefanobolli.blogspot.com/search?q={query} or, in general, http://username.blogspot.com/search?q={query}.
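
The difference between the two templates is easy to see once the {query} placeholder is filled in. The following sketch does the substitution with the JDK only. The class SearchUrlSketch is made up for illustration, and the encoding used here (URLEncoder) is an assumption about how the search terms are escaped, not a description of what the Toolbar does internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrlSketch {

    // Replace the {query} placeholder of a search
    // template with the URL-encoded search terms.
    static String buildSearchUrl(String template, String query) {
        try {
            String encoded = URLEncoder.encode(query, "UTF-8");
            return template.replace("{query}", encoded);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String generated =
            "http://stefanobolli.blogspot.com/?search={query}";
        String corrected =
            "http://stefanobolli.blogspot.com/search?q={query}";
        // The first URL is the one the button generates,
        // the second is the one Blogspot expects.
        System.out.println(buildSearchUrl(generated, "apache httpclient"));
        System.out.println(buildSearchUrl(corrected, "apache httpclient"));
    }
}
```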


Let's consider a more difficult case: Ubuntu Forums.
Forums of this kind usually use a URL formatted in this way:

Domain: http://ubuntuforums.org/
Method to search: search.php?do=process&query={query}
Search options: e.g. &showposts=1 or &forumchoice[]=id

In this case, we must be logged in to search our terms.

Sometimes we can retrieve the correct URL from the address bar; otherwise we need to work out what kind of portal we are dealing with (e.g. Joomla).


Best regards

Wednesday, 13 October 2010

Compiler Error C2733 - second C linkage of overloaded function 'function' not allowed

If we want to use Microsoft Visual C++ 2010 Express to compile a project that includes old Microsoft Platform SDK headers and libraries from previous Express releases, we have to modify Additional Include Directories and Additional Library Directories.
Microsoft Visual C++ 2010 updates the old Windows SDK by adding a new folder named Microsoft SDKs in "C:\Program Files". This folder contains new headers and new libs, so if we try to compile an old project that still references the old SDK, Visual C++ 2010 will report a bunch of C2733 errors (in my case "second C linkage of overloaded function '_interlockedbittestandset' not allowed").
As said above, the solution is quite easy: update the paths in Additional Include Directories and Additional Library Directories (for example, from C:\Program Files\Microsoft Platform SDK\Include to C:\Program Files\Microsoft SDKs\Windows\v7.0A\include).

Best regards.

Tuesday, 5 October 2010

Compiling Connection plugin C source code to connect messiah's animation to Maya

I'd like to try out the messiahStudio 4.5 Demo version and connect it with Maya 2011. I downloaded the plugin source code to read it. Actually I didn't finish reading it, because working through that much code takes a lot of time, especially if we don't have the faintest idea of how the program works.

I have Visual C++ 2005 Express (if you want to compile the plugin with Visual C++ 2010 read this post) on my notebook, so I skipped the reading and went straight to compiling the plugin, just to get an idea of what kind of monster I had to deal with.
I was able to compile the plugin (requirement, for me: Microsoft Platform SDK) after a few quick adjustments (the source is quite old). The most important:
  • References to file paths need to be resynced
  • #include <iostream> instead of #include <iostream.h>
  • "using namespace std;" in MH_NodeObject.cpp, messiahDeformerNode.cpp, pluginMain.cpp.
  • #include <windows.h> in MH_NodeObject.cpp and MH_System.cpp.
  • preprocessor option _DEBUG has to be removed from the release config
Visual Studio returned a few warnings (to hide them for now, use #pragma warning(disable : 4996) or, better, add _CRT_SECURE_NO_DEPRECATE to the C++ preprocessor definitions), but the project built.

The next step: test from Maya 2011. The output:
// messiahXform loaded //
// messiahDform loaded //
// messiahMaya loaded //
// messiah command loaded //
"// messiah started //" should appear here
// Error: source messiah; //
// Error: Cannot find file "messiah" for source statement. //
// Error: Cannot find procedure "pmgCreateMenu". //
// Warning: waitCursor stack empty // <- due to errors above

The errors above are thrown when the plugin executes these calls to the MGlobal::executeCommand method:

// create the messiah menu
// sprintf(txt,"%s\\%s",path,"messiah.mel");
// MGlobal::sourceFile(txt);
stat = MGlobal::executeCommand("source messiah");
stat = MGlobal::executeCommand("pmgCreateMenu");

MGlobal is a static class which provides access to Maya's model. This class also provides a method for executing MEL commands from within the Maya API. The plugin searches for a MEL script (messiah.mel) to create a menu.
To confirm this, I found out (which was very hard, as there's not much written about this application) that Messiah 2.5 "added a new menu to the messiahmayaXX.mll plugins. There is a new MEL script in the main messiah directory named messiah.mel. Now, when the messiahmaya plugin is initialized, a new 'messiah' menu will automatically be added to the main maya menu allowing you at add the xformer, deformer or bring up the messiah interface. You can also change the scale or query the current scale."
Unfortunately there's no MEL script in the package I downloaded from projectmessiah.com.
For now I can't figure out how to fix the problem without the MEL script, so commenting out the code above is the best choice.
By the way, there is an interesting read on the subject here.

MEL scripts aside, reading the code I realized that I had to define the host application name (Maya) and the module name (messiahMaya2011.mll) in params.h and in the C++ preprocessor definitions. These constants are passed as arguments to a couple of functions (e.g. messiahStart(...)).
Now the plugin should load messiahHOST.dll (ok) and then initialize Messiah. The application does start loading its libraries but, at a certain point, something goes wrong (code e06d7363: unhandled exception) while executing MH_System::_beginSession():

First-chance exception at  0x00000000 in maya.exe: 0xc0000005: Access violation reading location  0x00000000.

"A first chance exception is basically when an exception occurs and gives your code the "first chance" of handling it in a catch() block. If you don't handle the exception, and no one else does then your code receives an unhandled exception and is exited".

Messiah loads its own library, Filemode.dll, and Maya crashes.



(294.a60): Access violation - code c0000005 (first chance)

First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=781c1bf8 ecx=7817ab1f edx=29e22d58 esi=01649779 edi=2d875ff8
eip=00000000 esp=01649740 ebp=000000da iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00210202
00000000 ?? ???
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\Programmi\pmG\messiahStudio4.5...Filemode.dll -
0:000> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

FAULTING_IP:
+0
00000000 ?? ???


EXCEPTION_RECORD: ffffffff -- (.exr ffffffffffffffff)
ExceptionAddress: 00000000
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 00000000
Attempt to read from address 00000000

FAULTING_THREAD: 00000a60

PROCESS_NAME: maya.exe

OVERLAPPED_MODULE: AnimateMode

DEFAULT_BUCKET_ID: CORRUPT_MODULELIST_OVERLAPPED_MODULE


ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

READ_ADDRESS: 00000000

BUGCHECK_STR: ACCESS_VIOLATION

THREAD_ATTRIBUTES:
LAST_CONTROL_TRANSFER:
LAST_CONTROL_TRANSFER: from 2a630ef3 to 00000000

STACK_TEXT:
0164973c 2a630ef3 2d875ff8 781c1bf8 7817775d 0x0
WARNING: Stack unwind information not available. Following frames may be wrong.
fffffffe 00000000 00000000 00000000 00000000 Filemode!InitFunction+0xf2d3

FAILED_INSTRUCTION_ADDRESS:
+0
00000000 ?? ???

FOLLOWUP_IP:
Filemode!InitFunction+f2d3
2a630ef3 8b0db4de7a2a mov ecx,[Filemode!MemSort+0x173084 (2a7adeb4)]

SYMBOL_STACK_INDEX: 1

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: Filemode!InitFunction+f2d3

MODULE_NAME: Filemode

IMAGE_NAME: Filemode.dll

DEBUG_FLR_IMAGE_TIMESTAMP: 4b84579a

STACK_COMMAND: ~0s ; kb

FAILURE_BUCKET_ID: ACCESS_VIOLATION_BAD_IP_Filemode!InitFunction+f2d3

BUCKET_ID: ACCESS_VIOLATION_BAD_IP_Filemode!InitFunction+f2d3

Followup: MachineOwner

---------

ACCESS_VIOLATION_BAD_IP means a bad instruction pointer: we're jumping from 2a630ef3 to an invalid memory address (00000000). We don't have debug symbols for Filemode.dll and we can't run Maya in debug mode on Windows.

I tried compiling the plugin for Alias Maya 6.5 and, as expected, it works fine with Messiah 4.5. So, without going into useless and boring debugging details, the problem shouldn't be the plugin itself, because the only thing that differs here is Maya.

That's all for now, we'll see.

Best regards.

Saturday, 2 October 2010

New version of CGPersia Toolbar for Firefox

CGPersia Toolbar 0.1 is awaiting approval by Mozilla.
In the meantime it's possible to download the new version here.
This release is compatible with Firefox 2.0 - 3.6.*.


Changelog:
  • Removed links to forums that no longer exist.
  • Added links to new Forums.
  • Added 3 new drop-down buttons to dynamically retrieve Latest Posts, Hottest and Most Viewed Threads.
  • Added an Options Window.
  • Added the possibility to remove the new drop-down buttons through the Options Window.
  • Added the feature to save and reload the search history.
  • Added the file translations.dtd in locale/en-US to store tooltips and labels.
  • The extension now uses a new icon set. 
To-Do List:
  • Advanced search window.
  • A toolbar button for the Blog.
  • A method to drag and drop links onto a target programmed to run external applications (e.g. JDownloader).
  • Drop-down button for bookmarked pages.
Best regards.