The click-element path can now be set as an XPath relative to the loop

pull/129/head
naibo, 1 year ago
parent
commit ad76eeef24
11 changed files with 73 additions and 48 deletions
  1. ElectronJS/src/taskGrid/FlowChart.html (+19, -13)
  2. ElectronJS/src/taskGrid/FlowChart_CN.html (+10, -12)
  3. ElectronJS/src/taskGrid/logic.js (+3, -0)
  4. ElectronJS/tasks/160.json (+0, -1)
  5. ElectronJS/tasks/173.json (+1, -0)
  6. ElectronJS/tasks/174.json (+1, -0)
  7. ElectronJS/tasks/175.json (+1, -0)
  8. ElectronJS/tasks/176.json (+1, -0)
  9. ElectronJS/tasks/177.json (+1, -0)
  10. ExecuteStage/.vscode/launch.json (+1, -1)
  11. ExecuteStage/easyspider_executestage.py (+35, -21)

ElectronJS/src/taskGrid/FlowChart.html (+19, -13)

@ -134,12 +134,12 @@
</div>
<div class="elements" v-if="nodeType==2">
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>Element is inside iframe</p>
<div v-if="nowNode['isInLoop']">
<!-- Show this row only when the node is inside a loop -->
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>Use element inside the Loop</p>
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>Use element located by xpath relative to the loop</p>
</div>
<p v-if="!useLoop"><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>Element is inside iframe</p>
<div v-if='!useLoop'>
<div>
<label>XPath: <span style="font-size: 30px!important;" title="Relative XPATH writing: start with /, e.g. the loop item XPATH is /html/body/div[1], your input is /*[@id='tab-customer'], then the final addressed xpath is: /html/body/div[1]/*[@id='tab-customer']"></span></label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">Click here to view other equivalent XPath expressions</button></p>
@ -218,7 +218,7 @@
<label><strong>{{paras.parameters[paraIndex]["name"]}}</strong></label>
<p v-if="nowNode['isInLoop']"><input onkeydown="inputDelete(event)" type="checkbox" v-model='paras.parameters[paraIndex]["relative"]'></input>Use relative XPath</p>
<p v-if='!paras.parameters[paraIndex]["relative"]'><input onkeydown="inputDelete(event)" type="checkbox" v-model='paras.parameters[paraIndex]["iframe"]'></input>Element is inside iframe</p>
<p>XPATH: <span style="font-size: 30px!important;" title="Relative XPATH writing: start with /, e.g. the loop item XPATH is /html/body/div[1], your input is /*[@id='tab-customer'], then the final addressed xpath is: /html/body/div[1]/*[@id='tab-customer']"></span></p>
<p>XPATH (Field["FieldName"] can be used in any XPATHS): <span style="font-size: 30px!important;" title="Relative XPATH writing: start with /, e.g. the loop item XPATH is /html/body/div[1], your input is /*[@id='tab-customer'], then the final addressed xpath is: /html/body/div[1]/*[@id='tab-customer']"></span></p>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='paras.parameters[paraIndex]["relativeXPath"]' placeholder="If you want to write the XPath relative to the current element in the loop, you can write as *../div[1] which matches the first div child element of the parent of the current element in the loop."></textarea>
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(paras.parameters[paraIndex]['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">Click here to view other equivalent XPath expressions</button></p>
<p style="margin-top: 10px">
@ -364,13 +364,15 @@
<div v-if='nowNode["parameters"]["codeMode"] < 3 || nowNode["parameters"]["codeMode"] >= 5'>
<label>Code (Use Field["FieldName"] to input the lastest value of a field): </label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["code"]' placeholder="Please input a JavaScript command or a system command. For example, document.body.innerText = '1' is an example of a JavaScript command, and python D:/test.py is an example of a system command. If you choose to execute a JavaScript script for the current iteration, you can represent the element of the current iteration using arguments[0]. For instance, arguments[0].style.color = 'blue' sets the color of the element in the current iteration to blue."></textarea>
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='nowNode["parameters"]["codeMode"] == 5'>This option is an advanced feature that allows direct manipulation of the running browser using Python code. You can also customize variables in the entire execution environment and perform operations such as modifying and assigning values. Here are some examples:
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='nowNode["parameters"]["codeMode"] == 5'>Please read the instructions first and then write the specific code in the input box above (not in this box).
This option is an advanced feature that allows direct manipulation of the running browser using Python code. You can also customize variables in the entire execution environment and perform operations such as modifying and assigning values. Here are some examples:
1. Use `self.browser` to refer to the current browser being operated. You can directly use Selenium's API to perform operations, such as `self.browser.find_element(By.CSS_SELECTOR, "body").send_keys(Keys.END)` to scroll to the bottom.
2. Define a global variable: `self.myVar = 1`
3. Manipulate the above-defined global variable: `self.myVar = self.myVar + 1`
4. Print the above-defined global variable: `print(self.myVar)`
If you want to record your custom variable as a field, please select the next option, "Evaluate Python expressions in the execution environment."</pre>
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='nowNode["parameters"]["codeMode"] == 6'>This option is an advanced feature that allows directly returning the expression value of Python code. Here are some examples:
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='nowNode["parameters"]["codeMode"] == 6'>Please read the instructions first and then write the specific code in the input box above (not in this box).
This option is an advanced feature that allows directly returning the expression value of Python code, and in other places, use Field["FieldName"] to represent the return value of this operation.. Here are some examples:
1. Return relevant values of the current browser object. Use `self.browser` to refer to the current browser being operated. You can directly use Selenium's API to perform operations, such as `self.browser.find_element(By.CSS_SELECTOR, "body").text` to return the text on the current page.
2. Return the value of a custom global variable: `self.myVar`
3. Return the result of a conditional statement: `self.myVar == 1`
@ -430,12 +432,12 @@ Please note that this feature does not support assigning values to variables. In
</div>
<div class="elements" v-if="nodeType==7">
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>Element is inside iframe</p>
<div v-if="nowNode['isInLoop']">
<!-- Show this row only when the node is inside a loop -->
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>Use element inside the loop</p>
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>Use element located by xpath relative to the loop</p>
</div>
<p v-if='!useLoop'><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>Element is inside iframe</p>
<div v-if='!useLoop'>
<div>
<label>XPath: </label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">Click here to view other equivalent XPath expressions</button></p>
@ -474,11 +476,13 @@ Please note that this feature does not support assigning values to variables. In
<div v-else-if='parseInt(loopType) < 8'>
<label>Code (Use Field["FieldName"] to input the lastest value of a field):</label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="3" v-model='nowNode["parameters"]["code"]' placeholder="Continue the loop if the command return value is greater than 0 or evaluates to true; otherwise, stop the loop. For example, return document.body.scrollWidth > 1000 is an example of a JavaScript command return value, and python D:/test.py is an example of a system command return value."></textarea>
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 220px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='parseInt(loopType) == 7'>Loop based on the expression value of Python code. Here are some examples:
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 220px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='parseInt(loopType) == 7'>Please read the instructions first and then write the specific code in the input box above (not in this box).
Loop based on the expression value of Python code. Here are some examples:
1. Return relevant values of the current browser object. Use `self.browser` to refer to the current browser being operated. You can directly use Selenium's API to perform operations, such as `self.browser.find_element(By.CSS_SELECTOR, "body").text=="123"`, which checks whether the current page contains the text "123".
2. Return the value of a custom global variable: `self.myVar`
3. Return the result of a conditional statement: `self.myVar == 1`
If the expression returns a value greater than 0 or evaluates to True, the loop continues; otherwise, it stops.</pre>
If the expression returns a value greater than 0 or evaluates to True, the loop continues; otherwise, it stops.
</pre>
<label>Maximum wait time for script execution (0 represents unlimited wait time): </label>
<input onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["waitTime"]'></input>
</div>
@ -544,11 +548,13 @@ If the expression returns a value greater than 0 or evaluates to True, the loop
<div v-else-if='TClass > 0 && TClass < 7 || TClass == 8'>
<label>Code/Script Content: </label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="3" v-model='nowNode["parameters"]["code"]' placeholder="If the return value is greater than 0 or true, the operations within this branch will be executed; otherwise, they will not be executed. For example: return document.body.scrollWidth > 1000 or python D:/test.py, representing examples of JS command and system command return values."></textarea>
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word!important; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='TClass == 8'>Use the expression value of Python code to determine whether a condition is satisfied. Here are some examples:
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word!important; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='TClass == 8'>Please read the instructions first and then write the specific code in the input box above (not in this box).
Use the expression value of Python code to determine whether a condition is satisfied. Here are some examples:
1. Return relevant values of the current browser object. Use `self.browser` to refer to the current browser being operated. You can directly use Selenium's API to perform operations, such as `self.browser.find_element(By.CSS_SELECTOR, "body").text=="123"`, which checks whether the current page contains the text "123".
2. Return the value of a custom global variable: `self.myVar`
3. Return the result of a conditional statement: `self.myVar == 1`
If the expression returns a value greater than 0 or evaluates to True, the operations within this branch will be executed; otherwise, they will be skipped.</pre>
If the expression returns a value greater than 0 or evaluates to True, the operations within this branch will be executed; otherwise, they will be skipped.
</pre>
<label>Maximum wait time for script execution (0 represents unlimited wait time): </label>
<input onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["waitTime"]'></input>
</div>
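The help text for code modes 5 and 6 in this file separates running Python statements (assignment allowed) from returning the value of a Python expression (assignment not allowed). A minimal sketch of that distinction, assuming the two modes map onto Python's built-in `exec` and `eval`; the `executor` object is a hypothetical stand-in for what the help text calls `self`, not EasySpider's actual class:

```python
from types import SimpleNamespace

# Hypothetical stand-in for the executor object that the help text calls `self`.
executor = SimpleNamespace()
env = {"self": executor}

# Statement style (assumed to correspond to code mode 5): assignment works.
exec("self.myVar = 1", env)
exec("self.myVar = self.myVar + 1", env)

# Expression style (assumed to correspond to code mode 6): a value is returned.
value = eval("self.myVar", env)
is_one = eval("self.myVar == 1", env)

# An assignment is a statement, not an expression, so eval rejects it.
try:
    eval("self.myVar = 3", env)
    assignment_rejected = False
except SyntaxError:
    assignment_rejected = True

print(value, is_one, assignment_rejected)
```

This is why the help text sends users to the other option when they want to assign to a variable: an expression-only evaluator cannot parse `self.myVar = 1`.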

ElectronJS/src/taskGrid/FlowChart_CN.html (+10, -12)

@ -134,12 +134,12 @@
</div>
<div class="elements" v-if="nodeType==2">
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>元素在iframe内</p>
<div v-if="nowNode['isInLoop']">
<!-- Show this row only when the node is inside a loop -->
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>使用循环内的元素</p>
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>使用相对循环内的XPath定位到的元素</p>
</div>
<p v-if="!useLoop"><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>元素在iframe内</p>
<div v-if='!useLoop'>
<div>
<label>XPath: <span style="font-size: 30px!important;" title="相对XPATH写法:以/开头,如循环项XPATH为/html/body/div[1],您的输入为/*[@id='tab-customer'],则最终寻址的xpath为:/html/body/div[1]/*[@id='tab-customer']"></span></label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">点此查看其他等价的XPath</button></p>
@ -218,7 +218,7 @@
<label><strong>{{paras.parameters[paraIndex]["name"]}}</strong></label>
<p v-if="nowNode['isInLoop']"><input onkeydown="inputDelete(event)" type="checkbox" v-model='paras.parameters[paraIndex]["relative"]'></input>使用相对循环内的XPATH</p>
<p v-if='!paras.parameters[paraIndex]["relative"]'><input onkeydown="inputDelete(event)" type="checkbox" v-model='paras.parameters[paraIndex]["iframe"]'></input>元素在iframe内</p>
<p>XPath: <span style="font-size: 30px!important;" title="相对XPATH写法:以/开头,如循环项XPATH为/html/body/div[1],您的输入为/*[@id='tab-customer'],则最终寻址的xpath为:/html/body/div[1]/*[@id='tab-customer']"></span></p>
<p>XPath(所有XPath内均写Field["字段名"]表示参数值)<span style="font-size: 30px!important;" title="相对XPATH写法:以/开头,如循环项XPATH为/html/body/div[1],您的输入为/*[@id='tab-customer'],则最终寻址的xpath为:/html/body/div[1]/*[@id='tab-customer']"></span></p>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='paras.parameters[paraIndex]["relativeXPath"]' placeholder="如果要写相对循环内的xpath,可以写如*../div[1]即匹配当前循环元素的父元素的第一个div子元素"></textarea>
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(paras.parameters[paraIndex]['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">点此查看其他等价的XPath</button></p>
<p style="margin-top: 10px">
@ -370,15 +370,13 @@
2. 自定义一个全局变量:self.myVar = 1
3. 操纵上面定义的全局变量:self.myVar = self.myVar + 1
4. 打印上面定义的全局变量:print(self.myVar)
如果想要将自己定义的变量作为字段记录,请选择下一个“在执行环境下获得Python表达式值(eval操作)”选项。
</pre>
如果想要将自己定义的变量作为字段记录,请选择下一个“在执行环境下获得Python表达式值(eval操作)”选项。</pre>
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 200px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='nowNode["parameters"]["codeMode"] == 6'>请先阅读此说明,再在上方输入框(不是本框)写具体代码。
此选项为高级功能,可以直接返回Python代码的表达式值,示例:
此选项为高级功能,可以直接返回Python代码的表达式值,并在其他位置用Field["本操作名称"]表示此操作返回值,示例:
1. 返回当前浏览器对象的相关值,用self.browser表示当前操作的浏览器,可直接用selenium的API进行操作,如self.browser.find_element(By.CSS_SELECTOR, "body").text即可返回当前页面的文字。
2. 返回自定义全局变量的值:self.myVar
3. 返回条件判断的值:self.myVar == 1
注意此功能不能对变量进行赋值操作,即不可以写self.myVar = 1这种,如果想要进行赋值操作,请选择上一个“在执行环境下获得Python表达式值(eval操作)”选项。
</pre>
注意此功能不能对变量进行赋值操作,即不可以写self.myVar = 1这种,如果想要进行赋值操作,请选择上一个“在执行环境下获得Python表达式值(eval操作)”选项。</pre>
<p style="margin-top: 15px">是否将执行后的输出/返回值作为字段记录:</p>
<p><select v-model='nowNode["parameters"]["recordASField"]' class="form-control">
<option :value = 0>否(仍可在任意操作中用Field["操作名"]表示此命令返回值)</option>
@ -434,12 +432,12 @@
</div>
<div class="elements" v-if="nodeType==7">
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>元素在iframe内</p>
<div v-if="nowNode['isInLoop']">
<!-- Show this row only when the node is inside a loop -->
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>使用循环内的元素</p>
<p><input onkeydown="inputDelete(event)" type="checkbox" v-model='useLoop'></input>使用相对循环内的XPath定位的元素</p>
</div>
<p v-if='!useLoop'><input onkeydown="inputDelete(event)" type="checkbox" v-model='nowNode["parameters"]["iframe"]'></input>元素在iframe内</p>
<div v-if='!useLoop'>
<div>
<label>XPath: </label>
<textarea onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">点此查看其他等价的XPath</button></p>

ElectronJS/src/taskGrid/logic.js (+3, -0)

@ -65,13 +65,16 @@ function handleAddElement(msg) {
addElement(7, msg);
} else if (msg["type"] == "loopMouseMove") {
addElement(8, msg);
msg["xpath"] = ""; // Loop mouse-move over a single element: the element's own xpath is left empty
addElement(7, msg);
} else if (msg["type"] == "loopClickSingle") {
addElement(8, msg);
msg["xpath"] = ""; // Loop click on a single element: the element's own xpath is left empty
addElement(2, msg);
app._data.nowArrow["position"] = -1; // Loop click on a single element: the next insertion point is usually above the element
} else if (msg["type"] == "loopClickEvery") {
addElement(8, msg);
msg["xpath"] = ""; // Loop click on every element: the element's own xpath is left empty
addElement(2, msg);
} else if (msg["type"] == "singleCollect" || msg["type"] == "multiCollectNoPattern") {
if (app._data.nowNode != null && app._data["nowNode"]["option"] == 3) { // If the current node is already an extract-data node, add the parameter to it instead of creating a new extract-data node

ElectronJS/tasks/160.json (+0, -1)
Changes to this file are not shown because the file is too large.


ElectronJS/tasks/173.json (+1, -0)
Changes to this file are not shown because the file is too large.


ElectronJS/tasks/174.json (+1, -0)

@ -0,0 +1 @@
{"id":174,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"7/14/2023, 6:30:22 AM","update_time":"7/14/2023, 6:30:22 AM","version":"0.3.6","saveThreshold":10,"cloudflare":0,"environment":0,"maxViewLength":15,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"/手机/数码"},{"id":1,"name":"自定义参数_2","desc":"","type":"text","recordASField":1,"exampleValue":"自定义值"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0,"waitType":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3,4,5],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[
1]","//div[contains(., '/手机/数码')]","//DIV[@class='LeftSide_menu_item__SBMWC LeftSide_text_space__2UhbG ']","/html/body/div[last()-6]/div/div[last()-4]/div/div[last()-2]/div/div/div/div[last()-1]/div[last()-12]"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"clear":0,"paras":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"/手机/数码"}],"unique_index":"8ercz26okavlk1q1er0","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}},{"id":4,"index":4,"parentId":2,"type":0,"option":2,"title":"点击元素","sequence":[],"isInLoop":true,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":true,"xpath":"/a[1]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"paras":[]}},{"id":5,"index":5,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":2,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"clear":0,"paras":[{"nodeType":0,"contentType":0,"relative":false,"name":"自定义参数_2","desc":"","extractType":0,"relativeXPath":"//body","recordASField":1,"allXPaths":[],"exampleValues":[{"num":0,"value":"自定义值"}],"default":"","beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"paraType":"text"}]}}]}
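The task file above stores the new setting as a click node with `"useLoop":true` and a non-empty `"xpath"` (`/a[1]`) nested under a loop node. A small sketch of how such nodes could be picked out of a task graph; the `graph` list is a hand-copied subset of the fields in `174.json` with most keys omitted, and `loop_relative_nodes` is a hypothetical helper, not part of EasySpider:

```python
# Hand-copied subset of ElectronJS/tasks/174.json; most keys are omitted.
graph = [
    {"id": 2, "parentId": 0, "title": "循环",
     "parameters": {"useLoop": False,
                    "xpath": "/html/body/div[5]/div[1]/div[1]/div[1]/div[1]"
                             "/div[1]/div[1]/div[1]/div[1]/div"}},
    {"id": 3, "parentId": 2, "title": "提取数据",
     "parameters": {"useLoop": False, "xpath": ""}},
    {"id": 4, "parentId": 2, "title": "点击元素",
     "parameters": {"useLoop": True, "xpath": "/a[1]"}},
]


def loop_relative_nodes(graph):
    """Yield (node id, parent loop xpath, relative xpath) for nodes with useLoop set."""
    by_id = {node["id"]: node for node in graph}
    for node in graph:
        params = node["parameters"]
        if params.get("useLoop"):
            parent = by_id[node["parentId"]]
            yield node["id"], parent["parameters"]["xpath"], params["xpath"]


rows = list(loop_relative_nodes(graph))
for row in rows:
    print(row)
```

Only node 4 is reported here: it is the one node whose own `xpath` is meant to be resolved relative to the current loop item.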

ElectronJS/tasks/175.json (+1, -0)

@ -0,0 +1 @@
{"id":175,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"7/14/2023, 6:34:39 AM","update_time":"7/14/2023, 6:34:39 AM","version":"0.3.6","saveThreshold":10,"cloudflare":0,"environment":0,"maxViewLength":15,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0,"waitType":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"
","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}

ElectronJS/tasks/176.json (+1, -0)

@ -0,0 +1 @@
{"id":176,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"7/14/2023, 6:35:55 AM","update_time":"7/14/2023, 6:35:55 AM","version":"0.3.6","saveThreshold":10,"cloudflare":0,"environment":0,"maxViewLength":15,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2,3],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0,"waitType":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"LeftSide_menu_list__qXCeM\")]/div[2]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[2]","//div[contains(., '/家用电器')]","//DIV[@class='LeftSide_menu_item__SBMWC LeftSide_text_space__2UhbG 
']","/html/body/div[last()-5]/div/div[last()-4]/div/div[last()-2]/div/div/div/div[last()-1]/div[last()-11]"]}},{"id":3,"index":3,"parentId":0,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"LeftSide_menu_list__qXCeM\")]/div[4]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[4]","//div[contains(., '/家纺/家居/厨具')]","//DIV[@class='LeftSide_menu_item__SBMWC LeftSide_text_space__2UhbG ']","/html/body/div[last()-5]/div/div[last()-4]/div/div[last()-2]/div/div/div/div[last()-1]/div[last()-9]"]}}]}

ElectronJS/tasks/177.json (+1, -0)
Changes to this file are not shown because the file is too large.


ExecuteStage/.vscode/launch.json (+1, -1)

@ -12,7 +12,7 @@
"justMyCode": false,
// "args": ["--id", "[7]", "--read_type", "remote", "--headless", "0"]
// "args": ["--id", "[9]", "--read_type", "remote", "--headless", "0", "--saved_file_name", "YOUTUBE"]
"args": ["--id", "[58]", "--headless", "0", "--user_data", "1"]
"args": ["--id", "[65]", "--headless", "0", "--user_data", "1"]
}
]
}

ExecuteStage/easyspider_executestage.py (+35, -21)

@ -513,14 +513,20 @@ class BrowserThread(Thread):
def moveToElement(self, para, loopElement=None, loopPath="", index=0):
time.sleep(0.1) # Wait 0.1 seconds before moving
loopPath = replace_field_values(loopPath, self.outputParameters)
xpath = replace_field_values(para["xpath"], self.outputParameters)
if para["useLoop"]: # When using the loop, the clickPath passed in is the actual xpath
path = loopPath
if xpath == "":
path = loopPath
else:
path = "(" + loopPath + ")" + \
"[" + str(index + 1) + "]" + \
xpath
index = 0 # For a click on an element relative to the loop, index should be reset to 0 once the element is located
# element = loopElement
else:
index = 0
path = para["xpath"] # Otherwise use the xpath defined on the element
# element = self.browser.find_element(
# By.XPATH, path, iframe=para["iframe"])
path = xpath # Otherwise use the xpath defined on the element
path = replace_field_values(path, self.outputParameters)
try:
elements = self.browser.find_elements(
@ -529,11 +535,11 @@ class BrowserThread(Thread):
try:
ActionChains(self.browser).move_to_element(element).perform()
except:
print("移动鼠标到元素失败:", para["xpath"])
print("Failed to move mouse to element:", para["xpath"])
print("移动鼠标到元素失败:", xpath)
print("Failed to move mouse to element:", xpath)
except:
print("找不到元素:", para["xpath"])
print("Cannot find element:", para["xpath"])
print("找不到元素:", xpath)
print("Cannot find element:", xpath)
# Key functions for executing nodes
@ -690,16 +696,17 @@ class BrowserThread(Thread):
# newBodyText = self.browser.page_source
# newBodyText = self.browser.find_element(By.XPATH, "//body").text
newBodyText = self.browser.find_element(By.CSS_SELECTOR, "body", iframe=node["parameters"]["iframe"]).text
if newBodyText == bodyText: # If the page content has not changed
print("页面已检测不到新内容,停止循环。")
print("No new content detected on the page, stop loop.")
finished = True
break
else:
if node["parameters"]["exitCount"] == 0:
print("检测到页面变化,继续循环。")
print("Page changed detected, continue loop.")
bodyText = newBodyText
if node["parameters"]["exitCount"] == 0:
if newBodyText == bodyText: # If the page content has not changed
print("页面已检测不到新内容,停止循环。")
print("No new content detected on the page, stop loop.")
finished = True
break
else:
if node["parameters"]["exitCount"] == 0:
print("检测到页面变化,继续循环。")
print("Page changed detected, continue loop.")
bodyText = newBodyText
xpath = replace_field_values(
node["parameters"]["xpath"], self.outputParameters)
element = self.browser.find_element(
@ -1081,15 +1088,22 @@ class BrowserThread(Thread):
try:
# element = self.browser.find_element(
# By.XPATH, path, iframe=para["iframe"])
clickPath = replace_field_values(clickPath, self.outputParameters)
xpath = replace_field_values(para["xpath"], self.outputParameters)
if para["useLoop"]: # When using the loop, the clickPath passed in is the actual xpath
path = clickPath
if xpath == "":
path = clickPath
else:
path = "(" + clickPath + ")" + \
"[" + str(index + 1) + "]" + \
xpath
index = 0 # For a click on an element relative to the loop, index should be reset to 0 once the element is located
# element = loopElement
else:
index = 0
path = para["xpath"] # Otherwise use the xpath defined on the element
path = xpath # Otherwise use the xpath defined on the element
# element = self.browser.find_element(
# By.XPATH, path, iframe=para["iframe"])
path = replace_field_values(path, self.outputParameters)
elements = self.browser.find_elements(
By.XPATH, path, iframe=para["iframe"])
element = elements[index]
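The path selection that this commit adds to both `moveToElement` and the click routine can be restated as one pure function. This is a sketch for reading the diff, not code from the repository: `build_target_path` is a hypothetical name, and the `Field["..."]` substitution done by `replace_field_values` is left out.

```python
def build_target_path(loop_path: str, index: int, xpath: str, use_loop: bool):
    """Return the (path, index) pair to use with find_elements, following the diff's branches."""
    if use_loop:
        if xpath == "":
            # Empty xpath: act on the loop item itself, selected later by index.
            return loop_path, index
        # Non-empty xpath: pin the current loop item (XPath positions are 1-based),
        # append the relative part, and take the first match of the combined path.
        return "(" + loop_path + ")" + "[" + str(index + 1) + "]" + xpath, 0
    # Not using the loop: the element's own xpath, first match.
    return xpath, 0


# Third loop item (index 2) with the relative xpath used in tasks/174.json.
print(build_target_path("/html/body/div", 2, "/a[1]", True))
# Loop item itself, as generated by logic.js with an empty xpath.
print(build_target_path("/html/body/div", 2, "", True))
```

Wrapping the loop path in parentheses matters: in XPath, `(expr)[n]` selects the n-th node of the whole result set, whereas a trailing `[n]` without the parentheses would apply to the last step only.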
