Automating Adobe InDesign 2019 - AppleScript


Reading Files As Lists, Deeper Dive

Tuesday, June 04, 2019

In previous blog posts we spent a fair amount of time working with reading and writing files, and for good reason. A text file, a tab/return (.tsv), or comma/return (.csv) file can serve as the core source of information for an automation workflow.

Reading a File As Class

A parameter you may want to include in your read file handlers is one that defines how the file will be read. The class given in the as parameter of a read command can determine whether or not you are able to use a file provided by a client. If you don't specify the as (class) parameter, the file is read as text. But you may need to read the file as UTF-8 («class utf8») or UTF-16 (Unicode text) so that characters outside the basic ASCII range are interpreted correctly.
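As a rough sketch of how the as parameter might be exposed in a handler, the following picks the class from a simple string flag. The handler name readTextFile, the flag values "utf8" and "utf16", and the file name sample.txt are hypothetical; which class is correct depends on how the file was actually saved.

```applescript
-- Hypothetical handler: chooses the class for the as parameter from a flag
on readTextFile(filePath, fileEncoding)
	if fileEncoding is "utf8" then
		-- UTF-8 encoded file
		return read file filePath as «class utf8»
	else if fileEncoding is "utf16" then
		-- UTF-16 encoded file
		return read file filePath as Unicode text
	else
		-- No as parameter: the file is read as text
		return read file filePath
	end if
end readTextFile
```

A call might look like readTextFile((path to desktop from user domain as string) & "sample.txt", "utf8"). Reading a file with the wrong class usually shows up as garbled accented characters, which is a hint to try the other encoding.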

One handler you might want to add to your library is the following, which reads a tab/return delimited file as a list of lists using UTF-8.

(*Reads a tab/return delimited file in as a list of lists*)
on readAsListsofList(myFile)
	set processList to {}
	set fileRef to open for access (myFile)
	set myEOF to get eof fileRef
	if myEOF = 0 then
		close access fileRef --do not leave the file open when raising the error
		error "Text file is empty"
	end if
	--return is a carriage return; use linefeed for files with Unix line endings
	set textList to read fileRef as «class utf8» using delimiter {return}
	close access fileRef
	set oldDelim to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {tab}
	repeat with i from 1 to length of textList
		--split each line into its tab-separated fields
		set end of processList to text items of (item i of textList)
	end repeat
	set AppleScript's text item delimiters to oldDelim
	return processList
end readAsListsofList
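A minimal sketch of calling the handler above, assuming the user picks a UTF-8, tab/return delimited file with at least one row. The variable names rowList, firstRow, and firstField are only for illustration.

```applescript
-- choose file returns an alias, which open for access accepts
set myFile to choose file with prompt "Select a tab-delimited text file"
set rowList to readAsListsofList(myFile)

-- Each item of rowList is one row; each row is a list of field values
set firstRow to item 1 of rowList
set firstField to item 1 of firstRow
```

Keeping each row as its own list means a field can be addressed by row and column position, for example item 3 of item 2 of rowList for the third field of the second row.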

Read And Write As List

You can also use date or list for the class but this is only useful if the data was written using a write statement specifying the same class value as its as parameter. The following is an example of how to write and then read a file using the as list parameter. This readAsList handler is a simplified version and should only be used when you know the file in question is not empty and has been written using the as list parameter.

set myList to {"1234", "2222", "3333", "4567"}
set userPath to path to desktop from user domain as string
set filePath to userPath & "testFile.txt"
set fileRef to open for access file filePath with write permission
set eof fileRef to 0 --clear any existing contents before writing
write myList to fileRef as list --write does not return a value
close access fileRef
set myList to readAsList(filePath)

on readAsList(filePath)
	set fileList to read file filePath as list
	return fileList
end readAsList

List Considerations

When you pass a list to a handler, AppleScript passes a reference to the list rather than a copy of it. Passing a reference is much more efficient but can cause problems if you are not aware of it. For instance, if you make a change to the list in the handler, the original list is changed as well.

This is how the AppleScript Language Guide describes the difference:

Passing by Reference Versus Passing by Value

Within a handler, each parameter is like a variable, providing access to passed information. AppleScript passes all parameters by reference, which means that a passed variable is shared between the handler and the caller, as if the handler had created a variable using the set command. However, it is important to remember a point raised in Using the copy and set Commands: only mutable objects (those whose class is date, list, record, or script) can actually be changed.

As a result, a parameter’s class type determines whether information is effectively passed by value or by reference:

For mutable objects, information is passed by reference: If a handler changes the value of a parameter of this type, the original object is changed.

For all other class types, information is effectively passed by value: Although AppleScript passes a reference to the original object, that object cannot be changed. If the handler assigns a new value to a parameter of this type, the original object is unchanged.

If you want to pass by reference with a class type other than date, list, record, or script, you can pass a reference object that refers to the object in question. Although the handler will have access only to a copy of the reference object, the specified object will be the same. Changes to the specified object in the handler will change the original object, although changes to the reference object itself will not.

Some Examples

A few examples can demonstrate the difference:

set myList to {1234, 2222, 3333, 4567}
set newList to incrementMyList(myList)
{newList, myList}

on incrementMyList(aList)
	set newList to {}
	repeat with i from 1 to length of aList
		set item i of aList to ((item i of aList) + 1)
		set end of newList to item i of aList
	end repeat
	return newList
end incrementMyList

The result here is that the items in the original list are also incremented.

This can be corrected in a number of ways. Just be aware that the value of the receiving variable in the handler is not a new variable but acts as a reference to the original.
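One possible correction, sketched here with a hypothetical handler name incrementCopy, is to use the copy command inside the handler so that the work is done on an independent duplicate of the list.

```applescript
set myList to {1234, 2222, 3333, 4567}
set newList to incrementCopy(myList)
{newList, myList}

on incrementCopy(aList)
	-- copy makes an independent duplicate; set would share the original list
	copy aList to workList
	repeat with i from 1 to length of workList
		set item i of workList to ((item i of workList) + 1)
	end repeat
	return workList
end incrementCopy
```

With this version only newList holds the incremented values, and myList keeps the values it was defined with.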

Notice that values such as strings and numbers do not have the same pass by reference behavior. Although actually passed to the handler by reference, all immutable classes (including strings and numbers) cannot be changed so remain as originally defined. 

set fName to "John"
set lName to "Jones"
set fullName to getFullName(lName, fName)
{fullName, fName}

on getFullName(a, b)
	set b to "Mr. " & b
	set fullName to b & " " & a
	return fullName
end getFullName

Try the same thing, only this time using a list. You will see that the second item in the original list has also become "Mr. John".

set nameParts to {"Jones", "John"}
set fullName to getFullName(nameParts)
{fullName, item 2 of nameParts}

on getFullName(a)
	set item 2 of a to "Mr. " & (item 2 of a)
	set fullName to item 2 of a & " " & item 1 of a
	return fullName
end getFullName
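If you do want a handler to change a string held by the caller, the Language Guide passage quoted earlier suggests passing a reference object. The following is a small sketch of that idea, assuming fName is defined at the top level of the script; the handler name addTitle and the variable fNameRef are hypothetical.

```applescript
set fName to "John"
set fNameRef to a reference to fName
addTitle(fNameRef)
fName

on addTitle(nameRef)
	-- contents of dereferences the reference object, so the
	-- assignment targets the caller's variable rather than a copy
	set contents of nameRef to "Mr. " & (contents of nameRef)
end addTitle
```

Here the handler receives a copy of the reference object, but the object it points to is still the original variable, so fName should come back with the title added.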

When working with large lists (as when reading in a file), you will find it is more efficient to work through a reference created with the a reference to operator. For instance, the following example provided by Apple uses time of (current date) to show how long it takes to add 10,000 items to a list.

set bigList to {}
set numItems to 10000
set t to (time of (current date)) --Start timing operations
repeat with n from 1 to numItems
	copy n to the end of bigList
	-- DON'T DO THE FOLLOWING--it's even slower!
	-- set bigList to bigList & n
end repeat
set total to (time of (current date)) - t --End timing

On a fast machine, the 2 seconds it takes for the process may not be limiting, but on a slower machine you may not want to try it.

On the other hand, using a reference to makes working with a big list bearable even on a slower computer.

set bigList to {}
set bigListRef to a reference to bigList
set numItems to 10000
set t to (time of (current date)) --Start timing operations
repeat with n from 1 to numItems
	copy n to the end of bigListRef
end repeat
set myNow to (time of (current date))
set total to myNow - t --End timing
{t, myNow}

On a fast machine, the same process using a reference to is so fast that the results for myNow and t are identical. Keep in mind that time of (current date) counts whole seconds, so anything that finishes in under a second shows no difference at all.

You may also consider using a reference to when accessing items within a huge list. Here again, there is an appreciable performance difference.

set bigList to {}
set bigListRef to a reference to bigList
set numItems to 10000
repeat with n from 1 to numItems
	copy n to the end of bigListRef
end repeat

set numItems to 5000
set t to (time of (current date))
repeat with n from 1 to numItems
	item n of bigList
end repeat
set myNow to time of (current date)
set total to myNow - t
total

Change the above to use a reference to and see the difference.

set numItems to 5000
set bigListRef to a reference to bigList
set t to (time of (current date))
repeat with n from 1 to numItems
	item n of bigListRef
end repeat
set myNow to time of (current date)
set total to myNow - t
{myNow, t}

Onward and Upward

The next time a user sends you a "tsv" or a "csv" file, consider creating a script that processes the information as a list or a list of lists. You may find yourself getting a "hero" badge for your efforts.

Disclaimer:
Scripts provided are for demonstration and educational purposes. No representation is made as to their accuracy or completeness. Readers are advised to use the code at their own risk.
