Issue with deletion of documents. #3

@peetkes

Hi,

I am test-running taskbot to see whether I can use it to delete documents.
I used the sample from the docs to create 1 million docs, and I can see this filling up the Task Server queue.

Insert documents:

xquery version "1.0-ml";

import module namespace tb = "ns://blakeley.com/taskbot" at "/taskbot.xqy";

(: This will create 1 million docs in block sizes of 500, so it will create 2000 tasks on the taskserver queue :)
tb:list-segment-process(
  (: Total size of the job. :)
  1 to 1000 * 1000,
  (: Size of each segment of work. :)
  500,
  "/test/asset",
  (: This anonymous function will be called for each segment. :)
  function($list as item()+, $opts as map:map?) {
    (: Any chainsaw should have a safety. Check it here. :)
    tb:maybe-fatal(),
    for $i in $list
    return xdmp:document-insert(
      "/test/asset/"||$i,
      element asset {
        attribute id { 'asset'||$i },
        element asset-org { 1 + xdmp:random(99) },
        element asset-person { 1 + xdmp:random(999) },
        (1 to xdmp:random(9))
        ! element asset-ref { xdmp:random(1000) } }),
    xdmp:commit()
  },
  (: options - not used in this example. :)
  map:new(map:entry('testing', '123...')),
  (: This is an update, so be sure to say so. :)
  $tb:OPTIONS-UPDATE
)

Delete documents:
xquery version "1.0-ml";

import module namespace tb = "ns://blakeley.com/taskbot" at "/taskbot.xqy";

let $uris := cts:uris()
return tb:list-segment-process(
  (: Total size of the job. :)
  $uris,
  (: Size of each segment of work. :)
  500,
  "Delete Documents",
  (: This anonymous function will be called for each segment. :)
  function($list as item()+, $opts as map:map?) {
    (: Any chainsaw should have a safety. Check it here. :)
    tb:maybe-fatal(),
    for $uri in $list
    return xdmp:document-delete($uri),
    xdmp:sleep(500),
    xdmp:commit()
  },
  (: options - not used in this example. :)
  map:new(map:entry('testing', '123...')),
  (: This is an update, so be sure to say so. :)
  $tb:OPTIONS-UPDATE
)

When I execute the deletion, only about half of the documents get deleted!!
I also get errors like this in the log:
2015-12-23 12:55:16.183 Notice: TaskServer: XDMP-AS: (err:XPTY0004) $list as item()+ -- Invalid coercion: () as item()+
2015-12-23 12:55:16.183 Notice: TaskServer: in /taskbot.xqy, at 294:37,
2015-12-23 12:55:16.183 Notice: TaskServer: in function() as item()*() [1.0-ml]
2015-12-23 12:55:16.183 Notice: TaskServer: in /taskbot.xqy [1.0-ml]
2015-12-23 12:55:16.279 Info: App-Services: ()

This suggests that the list of URIs shrinks during execution as documents get deleted, so later segments end up empty. Even wrapping the cts:uris() call in xdmp:eager does not resolve this.

I got this resolved with the help of Geert Josten, who suggested the following:
wrap $uris (the first parameter of tb:list-segment-process) in json:array-values(json:to-array($uris)).
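
For reference, here is a minimal sketch of the delete job with that workaround applied. It is the same call as above with only the first argument changed. My assumption (not verified against the taskbot source) is that the round trip through a json:array forces the URI sequence to be fully materialized before any segment starts deleting.

```xquery
xquery version "1.0-ml";

import module namespace tb = "ns://blakeley.com/taskbot" at "/taskbot.xqy";

(: Assumption: copying the URIs into an in-memory json:array and reading
   them back yields a fixed list that no longer changes while the
   delete tasks run. :)
let $uris := json:array-values(json:to-array(cts:uris()))
return tb:list-segment-process(
  $uris,
  500,
  "Delete Documents",
  function($list as item()+, $opts as map:map?) {
    tb:maybe-fatal(),
    for $uri in $list
    return xdmp:document-delete($uri),
    xdmp:sleep(500),
    xdmp:commit()
  },
  map:new(map:entry('testing', '123...')),
  $tb:OPTIONS-UPDATE
)
```

Note that cts:uris() with no arguments returns every URI in the database (and needs the URI lexicon enabled), so on anything other than a scratch database the call should be constrained first.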

I also noticed something when I tried the $OPTIONS-SYNC-UPDATE parameter.
With $OPTIONS-SYNC-UPDATE, the Task Server queue never fills up the way it does with $OPTIONS-UPDATE. It does not even use all of the available threads on the Task Server.

Why is that?
