Great Statistics Tools with a Nice Debugging Environment

As is well known, when developing a program, locating and correcting errors usually takes more time than writing the code. A friendly debugging environment can therefore save a lot of time. In this respect, VB.NET and SQL sit at two extremes: the former provides an almost perfect debugging environment, while the latter provides almost no debugging tools.

As powerful statistical computing tools, R and esProc both support debugging, which saves time and effort and brings great convenience to business experts and analysts.

As development tools for computation and analytics, R and esProc can both debug to some degree. We will study how they differ in this respect.

Let's start by getting familiar with the debugging environments of R (taking RStudio as an example) and esProc through their respective interfaces:

RStudio Debugging Environment

RStudio%20Debugg.png


esProc's Debugging Environment

RStudio%20Debugg.png


Let's compare the basic functions.

Breakpoints: In R, a breakpoint is set by inserting browser() into the code. Users have to remove these statements by hand once debugging is done, which takes us back to the days of coding in BASIC before Windows existed and stirs a strong sense of nostalgia. Back then, removing the stop statements was an important chore before every release. By comparison, breakpoints in esProc work as they do in VB.NET and other modern languages: clicking a button or pressing a shortcut key sets a breakpoint on the cell where the cursor is located. This is the common style nowadays and should surprise no one.

Debug commands: In keeping with this style of breakpoint, R's debug commands are typed at the console: c resumes execution, n runs the next statement, and Q exits debug mode. There are also functions such as trace, setBreakpoint, debug, undebug, and stop. Note that it is best not to name any variable c, n, or Q in the code, because at the debug prompt these names are read as commands rather than as variables, and accidental conflicts will occur.

As for flow control, esProc is no different from VB.NET and similar languages: a click on a button or a shortcut key is all it takes, and users do not need to memorize any commands.

Variable watch: In RStudio, the variable watch window is on the right and lists all current variables. Clicking a variable opens a new window that displays its value. Alternatively, R users can enter fix(variable name) in the command line window. The esProc interface has a similar variable list in its bottom right corner. esProc users seldom need it, though, because esProc does not require variables to be named explicitly: the cell name serves as the variable name by default, so users can simply click a cell to view its value.

One thing to notice is that R displays variables in data frame format well. However, it handles irregularly structured variables far less gracefully, to the point of being nearly unreadable. Take the typical list below as an example:

RStudio%20Debugg%202.png


esProc does a much better job in this respect. It presents the same data through hyperlinks that users can drill into:

esProc%20debug%202.png


Next, let's compare some more advanced functions, starting with immediate execution.

In esProc, a cell is calculated automatically as soon as code is entered into it. Developers can therefore see the result right away, then adjust the code and rerun it as needed. This style speeds up development and lowers the chance of errors, and it lets beginners get familiar with the tool quickly. RStudio offers a similar facility that more closely resembles the "immediate window" of VB: users type code at a command line window and run it immediately, and once it runs correctly, they copy it into the script proper. On the whole, R is less convenient than esProc in this respect.

Finally, let's discuss the ability to debug functions separately.

R users can call debug(function name) to debug a function directly and on its own, which supports modular development and large-scale testing. esProc users, by contrast, cannot debug a function separately, which is something of a pity. However, R's debug function does not implement a truly "separate" test. It works by, in effect, adding a browser() command ahead of the function to be debugged, so all the code before the call still has to run before the debugger enters the function.

From another perspective, computational analysis software of this kind is rarely used for large-scale development and testing, so the ability to debug functions separately is of limited value.

From the above comparison, we can see that both R and esProc provide sufficiently powerful debugging functions. R is better at debugging functions separately, while esProc is more convenient and easier to use.

Author: Jim King

BI technology consultant for Raqsoft

10+ years of experience in BI/OLAP applications, statistical computing and analytics

Email: [email protected]

Website: www.raqsoft.com

Blog: datakeyword.blogspot.com
 
Brief Review: Subtable Grouping in SQL vs esProc


Grouping subtables (nested or related records) can be complex in SQL, especially when the aggregate is tied to a primary table. For instance, counting the cities where each employee has worked for at least a year involves nested grouping and filtering across the staff and resume tables.


In SQL, this requires:


  • A subquery joining and grouping by name and city.
  • A HAVING clause to filter by workingDays >= 365.
  • An outer query to count qualifying cities per employee.

    SELECT name, COUNT(*) cityCount
    FROM (
        SELECT staff.name, resume.city
        FROM staff, resume
        WHERE staff.name = resume.name
        GROUP BY staff.name, resume.city
        HAVING SUM(workingDays) >= 365
    ) t
    GROUP BY name;

In esProc, subtable processing is simplified by treating subtables as fields within the main table:


    =staff.new(name, resume.group(city).count(~.sum(workingDays)>=365):cityCount)

This approach offers a declarative and intuitive method to group and filter subtables directly, avoiding the complexity of nested SQL queries.
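To make the two grouping levels concrete, here is a minimal sketch in plain Python (chosen only as a neutral language, not part of either tool). The sample rows and the function name city_counts are hypothetical, and the table and field names simply mirror the queries above.

```python
from collections import defaultdict

# Hypothetical sample data; names mirror the staff and resume tables above.
staff = [{"name": "Alice"}, {"name": "Bob"}, {"name": "Carol"}]
resume = [
    {"name": "Alice", "city": "Boston", "workingDays": 200},
    {"name": "Alice", "city": "Boston", "workingDays": 200},
    {"name": "Alice", "city": "Austin", "workingDays": 400},
    {"name": "Bob", "city": "Denver", "workingDays": 100},
    {"name": "Bob", "city": "Seattle", "workingDays": 365},
]


def city_counts(staff, resume, min_days=365):
    """Count, per employee, the cities whose total working days reach min_days."""
    # Inner grouping: total working days per (name, city) pair.
    days = defaultdict(int)
    for row in resume:
        days[(row["name"], row["city"])] += row["workingDays"]

    # Outer grouping: count the qualifying cities for each employee.
    counts = {person["name"]: 0 for person in staff}
    for (name, _city), total in days.items():
        if name in counts and total >= min_days:
            counts[name] += 1
    return counts


print(city_counts(staff, resume))
```

This sketch keeps one entry per staff member, so an employee with no qualifying city gets a count of 0, whereas the SQL query above returns no row at all for such an employee. The esProc expression, which builds its result from staff, appears to behave like the sketch in this regard, though that is an assumption rather than something the text confirms.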


Conclusion: esProc provides a more straightforward, field-centric way to handle subtables, while SQL requires careful structuring and multiple groupings.
 