
Update Readmes

pull/73/head
naibo, 1 year ago
parent
commit 3f971c845d
16 files changed, with 260 additions and 314 deletions
  1. w (+0, -52)
  2. 特性.txt (+0, -63)
  3. .temp_to_pub/EasySpider_Linux_amd64/readme.txt (+56, -4)
  4. .temp_to_pub/EasySpider_Linux_amd64/软件使用说明.txt (+70, -5)
  5. e (+0, -0)
  6. rome (+0, -0)
  7. w (+0, -52)
  8. .temp_to_pub/EasySpider_MacOS_all_arch/readme.txt (+1, -3)
  9. .temp_to_pub/EasySpider_MacOS_all_arch/浏览器闪退解决方案(点击设计任务后Chrome弹出后立马退出).txt (+0, -1)
  10. .temp_to_pub/EasySpider_MacOS_all_arch/软件使用说明.txt (+1, -3)
  11. w (+0, -52)
  12. 特性.txt (+0, -63)
  13. .temp_to_pub/EasySpider_windows_386/readme.txt (+59, -3)
  14. .temp_to_pub/EasySpider_windows_386/软件使用说明.txt (+67, -3)
  15. .temp_to_pub/EasySpider_windows_amd64/Readme.txt (+3, -5)
  16. .temp_to_pub/EasySpider_windows_amd64/软件使用说明.txt (+3, -5)

.temp_to_pub/EasySpider_Linux_amd64/V0.3.1 → w

@ -1,52 +0,0 @@
## Update Instruction
1. Advanced Operations:
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
6. Added the functionality to download images.
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
11. Added instructions on how to execute tasks from the command line.
12. Added headless mode configuration, allowing the software to run without a browser interface.
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
15. Fixed the issue where the input box would freeze after saving a task.
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
17. Added the functionality to move the mouse to an element.
18. Displays a prompt when an element cannot be found.
19. Fixed the webpage scrolling bug.
20. The task name is initialized with the value of the page title upon the first visit.
21. Added version update prompts.
22. Added the information of the publisher as requested.
23. Updated Chrome version to 113.

.temp_to_pub/EasySpider_Linux_amd64/V0.3.1 → 特性.txt

@ -1,63 +0,0 @@
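If the download is slow, consider the download link for mainland China: [Download link for mainland China](https://github.com/NaiboWang/EasySpider/releases/download/v0.3.0/Download_Link_Address_in_China_Mainland.txt).
### We strongly recommend watching the new-feature walkthrough videos
Videos covering the latest version's features have been uploaded to Bilibili. They are very useful and recommended viewing.
[(Important) Custom condition checks using the return value of a JS command inside a loop item - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
[How to execute your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
[How to customize loop and judgment conditions - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
[How to take screenshots of elements and web pages, plus a guide to (headless mode) command-line execution](https://www.bilibili.com/video/BV1dV4y1z764/)
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
Note: the `.json` files in the task folder for v0.3.1 are incompatible with all previous versions; please redesign your tasks for v0.3.1.
## Update Notes
1. Advanced operations:
- **Custom scripts can be executed** in the task workflow, including **executing JavaScript commands** in the browser and **invoking operating-system-level scripts**; **the command's return value can be obtained and recorded**, greatly expanding what can be done.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/06e63a06-328d-4339-b40b-2d57c94cee66)
- Before and after each operation, you can specify a JavaScript command to run against the currently located element.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
2. **Judgment conditions and loop conditions** now also support **executing custom scripts**; whether the script's return value is truthy serves as the condition for the judgment or loop, which likewise greatly increases task flexibility. Loops gain an option to break via code, and custom operations can act on elements inside the loop.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/9dea0564-1a1c-487d-9fa4-427c5e284796)
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
3. Multiple XPath expressions can be generated at once for the user to choose from, and the **XPath Helper extension is pre-installed** for debugging XPath.
4. Added extraction of an element's background image URL, the current page title, and the current page URL.
5. Added saving of element screenshots; use this to capture a single element or the entire web page (works better in headless mode).
6. Added image downloading.
7. Added OCR recognition of elements (to use this feature, first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))
8. The return value of JavaScript code executed on an element can be extracted directly, enabling features such as regular-expression matching and reading an element's background color.
9. Added switching of dropdown options, and extraction of the currently selected option's value and text.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/42cc0009-00d1-4c5c-af47-0fa6340fba80)
10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, what each option's parameters mean, and how to modify the XPath of loop items).
11. When running a task, a hint now explains how to execute tasks from the command line: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/a9e774df-e345-4d51-b7c9-2c4dac0ec624)
12. Added a parallel multi-instance mode.
13. Added headless mode, i.e. a configuration that runs without a browser interface.
14. Fixed Chinese paths not being recognized correctly in user-configured browser mode.
15. Fixed a freeze when a conditional branch had no unconditional branch.
16. Fixed the input box freezing after saving a task.
17. The "Open Page" and "Click Element" operations now let you set a maximum page-load wait time.
18. Added moving the mouse to an element.
19. A prompt is shown when an element cannot be found.
20. Fixed the web page scrolling bug.
21. Added an operation for adding new extracted-data fields.
22. The task name is initialized to the title of the first page visited.
23. Added version update prompts.
24. Added publisher information, as requested.
25. Updated Chrome to version 113.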

.temp_to_pub/EasySpider_Linux_amd64/readme.txt (+56, -4)

@ -8,10 +8,62 @@ Welcome to promote this software to other friends.
This version is for Ubuntu 20.04, Debian, Deepin x64 and above.
Please wait up to 20 seconds if you see a white screen when opening EasySpider.
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software instead.
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
======Version Update Instruction======
-----v0.3.1-----
1. Advanced Operations:
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
6. Added the functionality to download images.
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
11. Added instructions on how to execute tasks from the command line.
12. Added headless mode configuration, allowing the software to run without a browser interface.
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
15. Fixed the issue where the input box would freeze after saving a task.
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
17. Added the functionality to move the mouse to an element.
18. Displays a prompt when an element cannot be found.
19. Fixed the webpage scrolling bug.
20. The task name is initialized with the value of the page title upon the first visit.
21. Added version update prompts.
22. Added the information of the publisher as requested.
23. Updated Chrome version to 113.
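
The import rule stated in this readme (copy .json files from another machine's "tasks" folder, where only files named with a number greater than 0 are supported) can be sketched in a few lines of Python. This is a minimal illustration and not part of EasySpider: the function names `is_importable` and `import_tasks` are hypothetical, and the sketch assumes "a number greater than 0" means a positive integer file name such as `12.json`.

```python
import shutil
from pathlib import Path


def is_importable(path: Path) -> bool:
    """Return True for files such as '12.json': a .json file named with an integer > 0.

    Assumption: the readme's "number greater than 0" is read as a positive integer.
    """
    stem = path.stem
    return (
        path.suffix == ".json"
        and stem.isascii()
        and stem.isdigit()
        and int(stem) > 0
    )


def import_tasks(source_dir: Path, target_dir: Path) -> list[str]:
    """Copy importable task files from source_dir into target_dir.

    Files with the same name in target_dir are overwritten.
    Returns the names of the files that were copied.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source_dir.glob("*.json")):
        if is_importable(path):
            shutil.copy2(path, target_dir / path.name)
            copied.append(path.name)
    return copied
```

The same function could be pointed at an "execution_instances" folder, since the readme applies the same naming rule to both folders. Because existing files are overwritten, it may be safer to back up the target folder first.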

.temp_to_pub/EasySpider_Linux_amd64/软件使用说明.txt (+70, -5)

@ -6,13 +6,78 @@
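Official site: https://github.com/NaiboWang/EasySpider
Supports Windows 10 x64 and above.
If you see a white screen on launch, wait up to 20 seconds and the interface will appear.
Supports Ubuntu 20.04, Debian, and Deepin x64 and above.
Video tutorial: https://www.bilibili.com/video/BV1Fk4y1L7xX/
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.
Tasks can be imported from other machines: just place the .json files from the other machine's tasks folder into the tasks folder in this directory. Likewise, execution instance files can be imported by copying the .json files from the execution_instances folder. Note that in both folders, only .json files named with a number greater than 0 are supported.
======Version Update Notes======
-----V0.3.1-----
If the download is slow, consider the download link for mainland China: [Download link for mainland China](https://github.com/NaiboWang/EasySpider/releases/download/v0.3.0/Download_Link_Address_in_China_Mainland.txt).
### We strongly recommend watching the new-feature walkthrough videos
Videos covering the latest version's features have been uploaded to Bilibili. They are very useful and recommended viewing.
[(Important) Custom condition checks using the return value of a JS command inside a loop item - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
[How to execute your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
[How to customize loop and judgment conditions - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
[How to take screenshots of elements and web pages, plus a guide to (headless mode) command-line execution](https://www.bilibili.com/video/BV1dV4y1z764/)
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
Note: the `.json` files in the task folder for v0.3.1 are incompatible with all previous versions; please redesign your tasks for v0.3.1.
## Update Notes
1. Advanced operations:
- **Custom scripts can be executed** in the task workflow, including **executing JavaScript commands** in the browser and **invoking operating-system-level scripts**; **the command's return value can be obtained and recorded**, greatly expanding what can be done.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/06e63a06-328d-4339-b40b-2d57c94cee66)
- Before and after each operation, you can specify a JavaScript command to run against the currently located element.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
2. **Judgment conditions and loop conditions** now also support **executing custom scripts**; whether the script's return value is truthy serves as the condition for the judgment or loop, which likewise greatly increases task flexibility. Loops gain an option to break via code, and custom operations can act on elements inside the loop.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/9dea0564-1a1c-487d-9fa4-427c5e284796)
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
3. Multiple XPath expressions can be generated at once for the user to choose from, and the **XPath Helper extension is pre-installed** for debugging XPath.
4. Added extraction of an element's background image URL, the current page title, and the current page URL.
5. Added saving of element screenshots; use this to capture a single element or the entire web page (works better in headless mode).
6. Added image downloading.
7. Added OCR recognition of elements (to use this feature, first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))
8. The return value of JavaScript code executed on an element can be extracted directly, enabling features such as regular-expression matching and reading an element's background color.
9. Added switching of dropdown options, and extraction of the currently selected option's value and text.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/42cc0009-00d1-4c5c-af47-0fa6340fba80)
10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, what each option's parameters mean, and how to modify the XPath of loop items).
11. When running a task, a hint now explains how to execute tasks from the command line: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/a9e774df-e345-4d51-b7c9-2c4dac0ec624)
12. Added a parallel multi-instance mode.
13. Added headless mode, i.e. a configuration that runs without a browser interface.
14. Fixed Chinese paths not being recognized correctly in user-configured browser mode.
15. Fixed a freeze when a conditional branch had no unconditional branch.
16. Fixed the input box freezing after saving a task.
17. The "Open Page" and "Click Element" operations now let you set a maximum page-load wait time.
18. Added moving the mouse to an element.
19. A prompt is shown when an element cannot be found.
20. Fixed the web page scrolling bug.
21. Added an operation for adding new extracted-data fields.
22. The task name is initialized to the title of the first page visited.
23. Added version update prompts.
24. Added publisher information, as requested.
25. Updated Chrome to version 113.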

.temp_to_pub/EasySpider_MacOS_all_arch/Please → e


.temp_to_pub/EasySpider_MacOS_all_arch/If → rome


.temp_to_pub/EasySpider_MacOS_all_arch/V0.3.1 → w

@ -1,52 +0,0 @@
## Update Instruction
1. Advanced Operations:
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
6. Added the functionality to download images.
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
11. Added instructions on how to execute tasks from the command line.
12. Added headless mode configuration, allowing the software to run without a browser interface.
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
15. Fixed the issue where the input box would freeze after saving a task.
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
17. Added the functionality to move the mouse to an element.
18. Displays a prompt when an element cannot be found.
19. Fixed the webpage scrolling bug.
20. The task name is initialized with the value of the page title upon the first visit.
21. Added version update prompts.
22. Added the information of the publisher as requested.
23. Updated Chrome version to 113.

.temp_to_pub/EasySpider_MacOS_all_arch/readme.txt (+1, -3)

@ -2,9 +2,7 @@ Official Site: https://github.com/NaiboWang/EasySpider
Welcome to promote this software to other friends.
This version is for macOS and runs on all chips, including Intel (such as Core i7) and Arm (such as M1).
Please wait up to 20 seconds if you see a white screen when opening EasySpider.
This version is for macOS and runs on all chips, including Intel (such as Core i7) and Arm (such as M1). Supported on macOS 11.x and above.
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp

.temp_to_pub/EasySpider_MacOS_all_arch/浏览器闪退解决方案(点击设计任务后Chrome弹出后立马退出).txt (+0, -1)

@ -1,4 +1,3 @@
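If Chrome opens and then immediately crashes after you click the "Design with Browser" button, follow the steps below:
The macOS version of the software may have the following problem: the Chrome it launches frequently updates itself automatically after opening, but the Chromedriver the software depends on is not updated along with Chrome, so the software can no longer open Chrome. Note that this version of EasySpider uses Chrome 113.0.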
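
The mismatch described above (Chrome updates itself, the bundled Chromedriver does not) can be detected by comparing the major version numbers of the two programs. The following is a minimal sketch, not EasySpider code: the function names are hypothetical, the version strings are illustrative, and it assumes that a Chromedriver build is expected to share the major version of the Chrome it drives, as was the case around version 113.

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.strip().split(".")[0])


def driver_matches_browser(chrome_version: str, driver_version: str) -> bool:
    """True when Chrome and Chromedriver share the same major version."""
    return major_version(chrome_version) == major_version(driver_version)


# Illustrative version strings only; read the real ones from your installation.
print(driver_matches_browser("113.0.5672.63", "113.0.5672.24"))
print(driver_matches_browser("114.0.5735.90", "113.0.5672.63"))
```

If the check fails after a Chrome auto-update, the browser and driver have drifted apart, which is the situation this troubleshooting file addresses.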

.temp_to_pub/EasySpider_MacOS_all_arch/软件使用说明.txt (+1, -3)

@ -2,9 +2,7 @@
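Official site: https://github.com/NaiboWang/EasySpider
Supports macOS on both Intel and Arm chips, such as Core i7 and M1.
If you see a white screen on launch, wait up to 20 seconds and the interface will appear.
Supports macOS on both Intel and Arm chips, such as Core i7 and M1; the minimum macOS version is 11.x.
Video tutorial: https://www.bilibili.com/video/BV1Fk4y1L7xX/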

.temp_to_pub/EasySpider_windows_386/V0.3.1 → w

@ -1,52 +0,0 @@
## Update Instruction
1. Advanced Operations:
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
6. Added the functionality to download images.
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
11. Added instructions on how to execute tasks from the command line.
12. Added headless mode configuration, allowing the software to run without a browser interface.
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
15. Fixed the issue where the input box would freeze after saving a task.
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
17. Added the functionality to move the mouse to an element.
18. Displays a prompt when an element cannot be found.
19. Fixed the webpage scrolling bug.
20. The task name is initialized with the value of the page title upon the first visit.
21. Added version update prompts.
22. Added the information of the publisher as requested.
23. Updated Chrome version to 113.

.temp_to_pub/EasySpider_windows_386/V0.3.1 → 特性.txt

@ -1,63 +0,0 @@
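If the download is slow, consider the download link for mainland China: [Download link for mainland China](https://github.com/NaiboWang/EasySpider/releases/download/v0.3.0/Download_Link_Address_in_China_Mainland.txt).
### We strongly recommend watching the new-feature walkthrough videos
Videos covering the latest version's features have been uploaded to Bilibili. They are very useful and recommended viewing.
[(Important) Custom condition checks using the return value of a JS command inside a loop item - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
[How to execute your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
[How to customize loop and judgment conditions - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
[How to take screenshots of elements and web pages, plus a guide to (headless mode) command-line execution](https://www.bilibili.com/video/BV1dV4y1z764/)
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
Note: the `.json` files in the task folder for v0.3.1 are incompatible with all previous versions; please redesign your tasks for v0.3.1.
## Update Notes
1. Advanced operations:
- **Custom scripts can be executed** in the task workflow, including **executing JavaScript commands** in the browser and **invoking operating-system-level scripts**; **the command's return value can be obtained and recorded**, greatly expanding what can be done.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/06e63a06-328d-4339-b40b-2d57c94cee66)
- Before and after each operation, you can specify a JavaScript command to run against the currently located element.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
2. **Judgment conditions and loop conditions** now also support **executing custom scripts**; whether the script's return value is truthy serves as the condition for the judgment or loop, which likewise greatly increases task flexibility. Loops gain an option to break via code, and custom operations can act on elements inside the loop.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/9dea0564-1a1c-487d-9fa4-427c5e284796)
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
3. Multiple XPath expressions can be generated at once for the user to choose from, and the **XPath Helper extension is pre-installed** for debugging XPath.
4. Added extraction of an element's background image URL, the current page title, and the current page URL.
5. Added saving of element screenshots; use this to capture a single element or the entire web page (works better in headless mode).
6. Added image downloading.
7. Added OCR recognition of elements (to use this feature, first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))
8. The return value of JavaScript code executed on an element can be extracted directly, enabling features such as regular-expression matching and reading an element's background color.
9. Added switching of dropdown options, and extraction of the currently selected option's value and text.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/42cc0009-00d1-4c5c-af47-0fa6340fba80)
10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, what each option's parameters mean, and how to modify the XPath of loop items).
11. When running a task, a hint now explains how to execute tasks from the command line: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/a9e774df-e345-4d51-b7c9-2c4dac0ec624)
12. Added a parallel multi-instance mode.
13. Added headless mode, i.e. a configuration that runs without a browser interface.
14. Fixed Chinese paths not being recognized correctly in user-configured browser mode.
15. Fixed a freeze when a conditional branch had no unconditional branch.
16. Fixed the input box freezing after saving a task.
17. The "Open Page" and "Click Element" operations now let you set a maximum page-load wait time.
18. Added moving the mouse to an element.
19. A prompt is shown when an element cannot be found.
20. Fixed the web page scrolling bug.
21. Added an operation for adding new extracted-data fields.
22. The task name is initialized to the title of the first page visited.
23. Added version update prompts.
24. Added publisher information, as requested.
25. Updated Chrome to version 113.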

.temp_to_pub/EasySpider_windows_386/readme.txt (+59, -3)

@ -4,10 +4,66 @@ Welcome to promote this software to other friends.
This version is for Windows 10 x32 and above.
Please wait up to 20 seconds if you see a white screen when opening EasySpider.
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
This software is not a trojan or virus. If antivirus software such as windows defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software instead.
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software instead.
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
======Version New Features======
-----v0.3.1-----
## Update Instruction
1. Advanced Operations:
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
6. Added the functionality to download images.
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
11. Added instructions on how to execute tasks from the command line.
12. Added headless mode configuration, allowing the software to run without a browser interface.
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
15. Fixed the issue where the input box would freeze after saving a task.
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
17. Added the functionality to move the mouse to an element.
18. Displays a prompt when an element cannot be found.
19. Fixed the webpage scrolling bug.
20. The task name is initialized with the value of the page title upon the first visit.
21. Added version update prompts.
22. Added the information of the publisher as requested.
23. Updated Chrome version to 113.

.temp_to_pub/EasySpider_windows_386/软件使用说明.txt (+67, -3)

@ -4,11 +4,75 @@
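Supports Windows 10 x32 and above.
If you see a white screen on launch, wait up to 20 seconds and the interface will appear.
Video tutorial: https://www.bilibili.com/video/BV1Fk4y1L7xX/
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.
Tasks can be imported from other machines: just place the .json files from the other machine's tasks folder into the tasks folder in this directory. Likewise, execution instance files can be imported by copying the .json files from the execution_instances folder. Note that in both folders, only .json files named with a number greater than 0 are supported.
======Version Update Notes======
-----v0.3.1-----
### We strongly recommend watching the new-feature walkthrough videos
Videos covering the latest version's features have been uploaded to Bilibili. They are very useful and recommended viewing.
[(Important) Custom condition checks using the return value of a JS command inside a loop item - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
[How to execute your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
[How to customize loop and judgment conditions - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
[How to take screenshots of elements and web pages, plus a guide to (headless mode) command-line execution](https://www.bilibili.com/video/BV1dV4y1z764/)
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
Note: the `.json` files in the task folder for v0.3.1 are incompatible with all previous versions; please redesign your tasks for v0.3.1.
## Update Notes
1. Advanced operations:
- **Custom scripts can be executed** in the task workflow, including **executing JavaScript commands** in the browser and **invoking operating-system-level scripts**; **the command's return value can be obtained and recorded**, greatly expanding what can be done.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/06e63a06-328d-4339-b40b-2d57c94cee66)
- Before and after each operation, you can specify a JavaScript command to run against the currently located element.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
2. **Judgment conditions and loop conditions** now also support **executing custom scripts**; whether the script's return value is truthy serves as the condition for the judgment or loop, which likewise greatly increases task flexibility. Loops gain an option to break via code, and custom operations can act on elements inside the loop.
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/9dea0564-1a1c-487d-9fa4-427c5e284796)
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
3. Multiple XPath expressions can be generated at once for the user to choose from, and the **XPath Helper extension is pre-installed** for debugging XPath.
4. Added extraction of an element's background image URL, the current page title, and the current page URL.
5. Added saving of element screenshots; use this to capture a single element or the entire web page (works better in headless mode).
6. Added image downloading.
7. Added OCR recognition of elements (to use this feature, first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))
8. The return value of JavaScript code executed on an element can be extracted directly, enabling features such as regular-expression matching and reading an element's background color.
9. Added switching of dropdown options, and extraction of the currently selected option's value and text.
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/42cc0009-00d1-4c5c-af47-0fa6340fba80)
10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, what each option's parameters mean, and how to modify the XPath of loop items).
11. When running a task, a hint now explains how to execute tasks from the command line: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
![image](https://github.com/NaiboWang/EasySpider/assets/30287768/a9e774df-e345-4d51-b7c9-2c4dac0ec624)
12. Added a parallel multi-instance mode.
13. Added headless mode, i.e. a configuration that runs without a browser interface.
14. Fixed Chinese paths not being recognized correctly in user-configured browser mode.
15. Fixed a freeze when a conditional branch had no unconditional branch.
16. Fixed the input box freezing after saving a task.
17. The "Open Page" and "Click Element" operations now let you set a maximum page-load wait time.
18. Added moving the mouse to an element.
19. A prompt is shown when an element cannot be found.
20. Fixed the web page scrolling bug.
21. Added an operation for adding new extracted-data fields.
22. The task name is initialized to the title of the first page visited.
23. Added version update prompts.
24. Added publisher information, as requested.
25. Updated Chrome to version 113.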

.temp_to_pub/EasySpider_windows_amd64/Readme.txt (+3, -5)

@ -4,18 +4,16 @@ Welcome to promote this software to other friends.
This version is for Windows 10 x64 and above.
Please wait up to 20 seconds if you see a white screen when opening EasySpider.
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
This software is not a trojan or virus. If antivirus software such as windows defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software instead.
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software instead.
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
======New Features======
======Version Update Instructions======
# V0.3.1
-----v0.3.1-----
## Update Instruction

.temp_to_pub/EasySpider_windows_amd64/软件使用说明.txt (+3, -5)

@ -4,18 +4,16 @@
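Supports Windows 10 x64 and above.
If you see a white screen on launch, wait up to 20 seconds and the interface will appear.
Video tutorial: https://www.bilibili.com/video/BV1Fk4y1L7xX/
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.
Tasks can be imported from other machines: just place the .json files from the other machine's tasks folder into the tasks folder in this directory. Likewise, execution instance files can be imported by copying the .json files from the execution_instances folder. Note that in both folders, only .json files named with a number greater than 0 are supported.
======New Features by Version======
======Version Update Notes======
## v0.3.1
-----v0.3.1-----
### We strongly recommend watching the new-feature walkthrough videos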
