Chapter 3. cURL Integration

3.1. DocumentBurster / cURL sample scripts
3.1.1. curl_ftp.groovy
3.1.2. curl_sftp.groovy

This chapter builds on the previous two topics: it shows how to use DocumentBurster scripting to meet very specific (non-standard) report distribution requirements.

DocumentBurster integrates closely with cURL, a Swiss-army knife for data transfer. Through cURL, DocumentBurster can distribute reports via HTTP or FTP, with or without authentication; it works over SSL and without user interaction. In fact, cURL (and thus DocumentBurster) can distribute files and data over a wide range of common Internet protocols, currently including HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP, LDAP, LDAPS, DICT, TELNET, FILE, IMAP, POP3, SMTP and RTSP.

cURL - http://curl.haxx.se/

Cross platform

cURL is portable and works on many platforms, including Windows, Linux, Mac OS X, MS-DOS and more.

On Windows, the DocumentBurster distribution package bundles a recent version of cURL. If your organization runs DocumentBurster on Windows, there is nothing more to download or install for cURL.

On UNIX-like systems, such as Linux and Mac OS X, the appropriate cURL binary distribution must be downloaded and installed separately. In addition, the cURL Groovy scripts bundled with DocumentBurster are written for Windows (for example, they point to the curl/win/curl.exe executable) and need small adjustments before they can be used on Linux/UNIX.

Command line cURL examples

cURL is a tool for getting or sending files using URL syntax. The URL syntax is protocol-dependent. Along with the URL for the required protocol, cURL can take some additional options in the command line.

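For complete cURL documentation, see the cURL manual (http://curl.haxx.se/docs/manual.html) and the cURL man page (http://curl.haxx.se/docs/manpage.html).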

The following sample cURL invocations upload a file to a remote server (taken from the cURL manual):

1. FTP / FTPS / SFTP / SCP

Upload data from a specified file, login with user and password

curl -T uploadfile -u user:passwd ftp://ftp.upload.com/myfile

Upload a local file to the remote site, keeping the local file name on the remote side

curl -T uploadfile -u user:passwd ftp://ftp.upload.com/

cURL also supports FTP upload through a proxy, but only if the proxy is configured to allow that kind of tunneling. If it is, you can run cURL in a fashion similar to

curl --proxytunnel -x proxy:port -T localfile ftp.upload.com

--ftp-create-dirs

When cURL is integrated with DocumentBurster™, the following option is of particular interest:

--ftp-create-dirs - (FTP/SFTP) When an FTP or SFTP URL/operation uses a path that doesn't currently exist on the server, the standard behavior of cURL is to fail. Using this option, cURL will instead attempt to create missing directories.
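
As a rough illustration of how --ftp-create-dirs combines with the upload options shown earlier, the following shell sketch only builds and prints a command line; it does not contact any server. The function name ftp_upload_cmd, the file name invoice.pdf and the directory reports/2011/june are assumptions made up for this example and are not part of cURL or DocumentBurster.

```shell
#!/bin/sh
# Sketch only: prints a curl FTP upload command instead of running it.
# ftp_upload_cmd is a hypothetical helper, not a cURL or DocumentBurster command.
ftp_upload_cmd() {
    file=$1 user=$2 pass=$3 host=$4 dir=$5
    # Make sure the remote path ends with "/" so that curl treats it as a
    # directory and appends the local file name to it.
    case $dir in
        */) ;;
        *)  dir=$dir/ ;;
    esac
    # --ftp-create-dirs asks curl to create missing remote directories.
    printf 'curl --ftp-create-dirs -T "%s" -u %s:%s ftp://%s/%s\n' \
        "$file" "$user" "$pass" "$host" "$dir"
}

ftp_upload_cmd invoice.pdf user passwd ftp.upload.com reports/2011/june
```

The printed line can be reviewed first and then run by hand against a test server.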

2. HTTP

Upload data from a specified file

curl -T uploadfile http://www.upload.com/myfile

Note that the HTTP server must be configured to accept PUT before this can succeed.

Debugging and tracing cURL - VERBOSE / DEBUG

If cURL fails where it isn't supposed to, if the servers don't let you in, if you can't understand the responses: use the -v flag to get verbose fetching. cURL will output lots of info and what it sends and receives in order to let the user see all client-server interaction (but it won't show you the actual data).

curl -v ftp://ftp.upload.com/

To get even more details and information on what cURL does, try using the --trace or --trace-ascii options with a given file name to log to, like this

curl --trace trace.txt www.haxx.se
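
When a report upload fails, it can help to replay the same transfer by hand with tracing switched on. The sketch below prints such a command rather than running it. The helper name trace_cmd and the log file name are hypothetical; --trace-ascii and --trace-time are cURL options described in the cURL man page.

```shell
#!/bin/sh
# Sketch only: prints a curl command with tracing enabled instead of running it.
# trace_cmd is a hypothetical helper name used for this example.
trace_cmd() {
    url=$1 logfile=$2
    # --trace-ascii writes the trace dump, without the hex part, to the given file;
    # --trace-time prefixes every trace line with a time stamp.
    printf 'curl --trace-ascii "%s" --trace-time %s\n' "$logfile" "$url"
}

trace_cmd ftp://ftp.upload.com/ trace.txt
```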

3.1. DocumentBurster / cURL sample scripts

Although cURL supports many protocols, DocumentBurster ships sample scripts for cURL report distribution through the most commonly used ones, such as FTP, SFTP and FILE. Any other protocol supported by cURL should be achievable with small changes to the scripts provided in the default DocumentBurster distribution package.

3.1.1. curl_ftp.groovy

The curl_ftp.groovy script is an alternative to the FTP Upload GUI capability introduced in the DocumentBurster™ User Guide. The GUI covers common FTP report distribution use cases; the script is recommended for more advanced FTP scenarios that require the full cURL FTP capabilities. For example, with this script DocumentBurster can automatically create a custom hierarchy of directories on the FTP server before uploading the reports.

Copy the content of scripts/burst/samples/curl_ftp.groovy into the scripts/burst/endExtractDocument.groovy script. By default, the script reads the FTP connection values (user, password, host and path) from the $var0$, $var1$, $var2$ and $var3$ user report variables. If the burst reports are configured this way, there is nothing more to do and the FTP upload works without any modification to the script. Otherwise, modify the script as needed.

Although the script looks long, it contains only a few simple lines of active code; most of it is comments describing the purpose of each section.

/*
 *
 * 1. This script should be used:
 *
 *      1.1 - As a script to upload reports by FTP using cURL.
 *      1.2 - As a sample and starting script to invoke cURL during the
 *      report bursting life cycle.
 *
 * 2. curl is a tool to transfer data from or to a server, using one of the
 *    supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP,
 *    IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS,
 *    TELNET and TFTP).
 *    
 *    The command is designed to work without user interaction.
 *
 * 3. curl offers a busload of useful tricks like proxy support,
 *    user authentication, FTP upload, HTTP post, SSL connections, cookies,
 *    file transfer resume and more.
 *
 * 4. The URL syntax is protocol-dependent. You'll find a detailed description
 *    in RFC 3986.
 *
 * 5. The script should be executed during the endExtractDocument
 *    report bursting lifecycle phase.
 *
 * 6. Please copy and paste the content of this sample script
 *    into the existing scripts/burst/endExtractDocument.groovy
 *    script.
 *
 * 7. For full documentation of cURL and FTP, please see
 *
 *      7.1. http://curl.haxx.se/docs/manual.html
 *      7.2. http://curl.haxx.se/docs/manpage.html
 *
 */

import com.sourcekraft.documentburster.variables.Variables

/*
 *
 *    The file to be uploaded is the file which has
 *    just been burst.
 *
 */
def uploadFilePath = ctx.extractFilePath

/*
 *    By default the script is extracting the required FTP 
 *    session information from the following sources:
 *
 *      userName - from the content of $var0$ user variable
 *      password - from the content of $var1$ user variable
 *
 *      hostName - from the content of $var2$ user variable
 *      absolutePath - from the content of $var3$ user variable
 *
 */
def userName = ctx.variables.getUserVariables(ctx.token).get("var0")
def password = ctx.variables.getUserVariables(ctx.token).get("var1")

def hostName = ctx.variables.getUserVariables(ctx.token).get("var2")
def absolutePath = ctx.variables.getUserVariables(ctx.token).get("var3")

/*
 *
 *    $execOptions is the command line to be sent for execution to cURL
 *    - see http://curl.haxx.se/docs/manpage.html
 *
 *    --ftp-create-dirs -
 *
 *      (FTP/SFTP) When an FTP or SFTP URL/operation uses a path that
 *      doesn't currently exist on the server, the standard behavior
 *      of curl is to fail.
 *      Using this option, curl will instead attempt to create the
 *      missing directories.
 *
 *    -T, --upload-file <file>
 *
 *      This transfers the specified local file to the remote URL.
 *      If there is no file part in the specified URL, Curl will
 *      append the local file name.
 *      NOTE that you must use a trailing / on the last directory
 *      to really prove to Curl that there is no file name or curl
 *      will think that your last directory name is the remote file
 *      name to use. That will most likely cause the upload
 *      operation to fail.
 *      If this is used on a HTTP(S) server, the PUT command
 *      will be used.
 *
 *    -u, --user <user:password>
 *
 *      Specify the user name and password to use for server authentication.
 *
 *    --trace <file>
 *
 *      Enables a full trace dump of all incoming and outgoing data,
 *      including descriptive information, to the given output file.
 *      Use "-" as filename to have the output sent to stdout.
 *      This option overrides previous uses of -v, --verbose or --trace-ascii.
 *      If this option is used several times, the last one will be used.
 *
 *    --trace-ascii <file>
 *
 *      Enables a full trace dump of all incoming and outgoing data,
 *      including descriptive information, to the given output file.
 *      Use "-" as filename to have the output sent to stdout.
 *      This is very similar to --trace, but leaves out the hex part
 *      and only shows the ASCII part of the dump. It makes smaller
 *      output that might be easier to read for untrained humans.
 *      This option overrides previous uses of -v, --verbose or --trace.
 *      If this option is used several times, the last one will be used.
 *
 *    --trace-time
 *
 *      Prepends a time stamp to each trace or verbose line that curl displays.
 *      (Added in curl 7.14.0)
 *
 *    -v, --verbose
 *
 *      Makes the fetching more verbose/talkative.
 *      Mostly useful for debugging. A line starting with '>'
 *      means "header data" sent by curl, '<' means "header data"
 *      received by curl that is hidden in normal cases, and a 
 *      line starting with '*' means additional info provided
 *      by curl.
 *      Note that if you only want HTTP headers in the output,
 *      -i, --include might be the option you're looking for.
 *      If you think this option still doesn't give you enough details,
 *      consider using --trace or --trace-ascii instead.
 *      This option overrides previous uses of --trace-ascii or --trace.
 *      Use -s, --silent to make curl quiet.
 *
 *    FTPS
 *
 *      It is just like for FTP, but you may also want to specify and use
 *      SSL-specific options for certificates etc.
 *      Note that using FTPS:// as prefix is the "implicit" way as
 *      described in the standards while the recommended "explicit" way is
 *      done by using FTP:// and the --ftp-ssl option.
 *
 *    SFTP / SCP
 *
 *      This is similar to FTP, but you can specify a private key to use
 *      instead of a password.
 *      Note that the private key may itself be protected by a password that is
 *      unrelated to the login password of the remote system.
 *      If you provide a private key file you must also provide a public key file.
 *
 *    For more details see:
 *
 *      1. http://curl.haxx.se/docs/manual.html
 *      2. http://curl.haxx.se/docs/manpage.html
 *
 */
def execOptions =  "--ftp-create-dirs"
execOptions += " -T \"$uploadFilePath\""
execOptions += " -u $userName:$password"
execOptions += " ftp://$hostName/$absolutePath"

def ant = new AntBuilder()

/*
 *    The command executed by curl will be logged in
 *    the logs/DocumentBurster.log file
 */
log.info("Executing command: curl.exe $execOptions")

/*
 *
 *    1. http://groovy.codehaus.org/Executing%20External%20Processes%20From%20Groovy
 *    2. cURL is printing its logging operations to the logs/cURL.log file
 *
 */
ant.exec(
        append: "true",
        failonerror: "true",
        output: "logs/cURL.log",
        executable: 'curl/win/curl.exe') {
    arg(line: "$execOptions")
}

3.1.2. curl_sftp.groovy

The curl_sftp.groovy script uploads the burst reports through SFTP (Secure File Transfer Protocol, also known as Secure FTP).

With minimal modifications to $execOptions, the script can be adapted to other protocols such as FTPS or SCP. See the cURL manual ("cURL usage explained") for more details.
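
To make the differences between these protocols concrete, here is a small shell sketch that prints the cURL command line for each variant instead of executing it. The helper name upload_cmd, the variant labels (ftps-explicit, ftps-implicit) and the host upload.example.com are assumptions for illustration. The URL schemes and the --ftp-ssl option are the ones named in the comments of the sample scripts. In the Groovy script, the equivalent change would be made in $execOptions.

```shell
#!/bin/sh
# Sketch only: prints the curl command for a given protocol variant.
# upload_cmd and the variant labels are hypothetical, chosen for this example.
upload_cmd() {
    variant=$1 file=$2 user=$3 pass=$4 host=$5 remote_path=$6
    case $variant in
        sftp|scp)
            # Same options for SFTP and SCP; only the URL scheme differs.
            opts="" url="${variant}://$host/$remote_path" ;;
        ftps-explicit)
            # Explicit FTPS: plain ftp:// URL plus the --ftp-ssl option.
            opts="--ftp-ssl " url="ftp://$host/$remote_path" ;;
        ftps-implicit)
            # Implicit FTPS: ftps:// URL prefix.
            opts="" url="ftps://$host/$remote_path" ;;
        *)
            echo "unsupported variant: $variant" >&2
            return 1 ;;
    esac
    printf 'curl %s-T "%s" -u %s:%s %s\n' "$opts" "$file" "$user" "$pass" "$url"
}

upload_cmd scp report.pdf user passwd upload.example.com reports/report.pdf
upload_cmd ftps-explicit report.pdf user passwd upload.example.com reports/report.pdf
```

Keeping the variants in one place makes it easy to compare what actually changes between protocols: usually just the URL scheme, and for explicit FTPS one extra option.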

Copy the content of scripts/burst/samples/curl_sftp.groovy into the scripts/burst/endExtractDocument.groovy script. By default, the script reads the SFTP connection values (user, password, host and path) from the $var0$, $var1$, $var2$ and $var3$ user report variables. If the burst reports are configured this way, there is nothing more to do and the SFTP upload works without any further modification to the script. Otherwise, modify the script as needed.

Although the script looks long, it contains only a few simple lines of active code; most of it is comments describing the purpose of each section.

/*
 *
 * 1. This script should be used:
 *
 *      1.1 - As a script to upload reports by SFTP using cURL.
 *      1.2 - As a sample and starting script to invoke cURL during the
 *      report bursting life cycle.
 *
 * 2. curl is a tool to transfer data from or to a server, using one of the
 *    supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP,
 *    IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS,
 *    TELNET and TFTP).
 *    
 *    The command is designed to work without user interaction.
 *
 * 3. curl offers a busload of useful tricks like proxy support,
 *    user authentication, FTP upload, HTTP post, SSL connections, cookies,
 *    file transfer resume and more.
 *
 * 4. The URL syntax is protocol-dependent. You'll find a detailed description
 *    in RFC 3986.
 *
 * 5. The script should be executed during the endExtractDocument
 *    report bursting lifecycle phase.
 *
 * 6. Please copy and paste the content of this sample script
 *    into the existing scripts/burst/endExtractDocument.groovy
 *    script.
 *
 * 7. For full documentation of cURL and SFTP, please see
 *
 *      7.1. http://curl.haxx.se/docs/manual.html
 *      7.2. http://curl.haxx.se/docs/manpage.html
 *
 */

import com.sourcekraft.documentburster.variables.Variables


/*
 *
 *    The file to be uploaded is the file which has
 *    just been burst.
 *
 */
def uploadFilePath = ctx.extractFilePath

/*
 *    By default the script is extracting the required SFTP 
 *    session information from the following sources:
 *
 *      userName - from the content of $var0$ user variable
 *      password - from the content of $var1$ user variable
 *
 *      hostName - from the content of $var2$ user variable
 *      absolutePath - from the content of $var3$ user variable
 *
 */
def userName = ctx.variables.getUserVariables(ctx.token).get("var0")
def password = ctx.variables.getUserVariables(ctx.token).get("var1")

def hostName = ctx.variables.getUserVariables(ctx.token).get("var2")
def absolutePath = ctx.variables.getUserVariables(ctx.token).get("var3")

/*
 *
 *    $execOptions is the command line to be sent for execution to cURL
 *    - see http://curl.haxx.se/docs/manpage.html
 *
 *    --ftp-create-dirs -
 *
 *      (FTP/SFTP) When an FTP or SFTP URL/operation uses a path that
 *      doesn't currently exist on the server, the standard behavior
 *      of curl is to fail.
 *      Using this option, curl will instead attempt to create 
 *      missing directories.
 *
 *    -T, --upload-file <file>
 *
 *      This transfers the specified local file to the remote URL.
 *      If there is no file part in the specified URL, Curl will
 *      append the local file name.
 *      NOTE that you must use a trailing / on the last directory
 *      to really prove to Curl that there is no file name or curl
 *      will think that your last directory name is the remote file
 *      name to use. That will most likely cause the upload
 *      operation to fail.
 *      If this is used on a HTTP(S) server, the PUT command
 *      will be used.
 *
 *    -u, --user <user:password>
 *
 *      Specify the user name and password to use for server authentication.
 *
 *    --trace <file>
 *
 *      Enables a full trace dump of all incoming and outgoing data,
 *      including descriptive information, to the given output file.
 *      Use "-" as filename to have the output sent to stdout.
 *      This option overrides previous uses of -v, --verbose or --trace-ascii.
 *      If this option is used several times, the last one will be used.
 *
 *    --trace-ascii <file>
 *
 *      Enables a full trace dump of all incoming and outgoing data,
 *      including descriptive information, to the given output file.
 *      Use "-" as filename to have the output sent to stdout.
 *      This is very similar to --trace, but leaves out the hex part
 *      and only shows the ASCII part of the dump. It makes smaller
 *      output that might be easier to read for untrained humans.
 *      This option overrides previous uses of -v, --verbose or --trace.
 *      If this option is used several times, the last one will be used.
 *
 *    --trace-time
 *
 *      Prepends a time stamp to each trace or verbose line that curl
 *      displays.
 *      (Added in curl 7.14.0)
 *
 *    -v, --verbose
 *
 *      Makes the fetching more verbose/talkative.
 *      Mostly useful for debugging. A line starting with '>'
 *      means "header data" sent by curl, '<' means "header data"
 *      received by curl that is hidden in normal cases, and a 
 *      line starting with '*' means additional info provided by curl.
 *      Note that if you only want HTTP headers in the output,
 *      -i, --include might be the option you're looking for.
 *      If you think this option still doesn't give you enough details,
 *      consider using --trace or --trace-ascii instead.
 *      This option overrides previous uses of --trace-ascii or --trace.
 *      Use -s, --silent to make curl quiet.
 *
 *    FTPS
 *
 *      It is just like for FTP, but you may also want to specify and use
 *      SSL-specific options for certificates etc.
 *      Note that using FTPS:// as prefix is the "implicit" way as
 *      described in the standards while the recommended "explicit" way is
 *      done by using FTP:// and the --ftp-ssl option.
 *
 *    SFTP / SCP
 *
 *      This is similar to FTP, but you can specify a private key to use
 *      instead of a password.
 *      Note that the private key may itself be protected by a password that is
 *      unrelated to the login password of the remote system.
 *      If you provide a private key file you must also provide
 *      a public key file.
 *
 *    For more details see:
 *
 *      1. http://curl.haxx.se/docs/manual.html
 *      2. http://curl.haxx.se/docs/manpage.html
 *
 */
def execOptions =  "-T \"$uploadFilePath\""
execOptions += " -u $userName:$password"
execOptions += " sftp://$hostName/$absolutePath"

def ant = new AntBuilder()

/*
 *
 *    The command executed by curl will be logged in
 *    the logs/DocumentBurster.log file 
 *
 */
log.info("Executing command: curl.exe $execOptions")

/*
 * 
 *    1. http://groovy.codehaus.org/Executing%20External%20Processes%20From%20Groovy
 *    2. cURL is printing its logging operations to the logs/cURL.log file
 *   
 */
ant.exec(
        append: "true",
        failonerror: "true",
        output: "logs/cURL.log",
        executable: 'curl/win/curl.exe') {
    arg(line: "$execOptions")
}